Optimal resource utilization is one of the most general “meta”-settings in operations research: many hard
optimization problems can be cast as problems of optimal resource utilization. Additional challenges
are introduced by uncertainties; the difficulties are further multiplied in a dynamic context. This project
has  considered  a  class  of  discrete  and  combinatorial  optimal  resource  utilization  problems  under 
uncertainties  that  arise  in  the  context  of  the  optimal  stopping  problems.  In  addition,  as  a  generalization  of 
traditional  stochastic  formulations  that  optimize  the  expected  payoff  or  cost,  we  considered  risk  averse 
discrete  and  combinatorial  optimization  problems,  where  the  risk  of  the  stopping  decision  was  estimated 
using  a  coherent  or  convex  risk  measure.  In  particular,  we  developed  a  special  class  of  certainty 
equivalent  (CE)  measures  of  risk  that  can  be  represented  via  solution  of  a  specially  formulated 
(stochastic)  optimization  problem.  A  number  of  solution  techniques  for  discrete  and  combinatorial 
problems  involving  CE  measures  have  been  developed,  including  exact  methods  based  on  polyhedral 
approximations,  branch-and-bound  and  branch-and-cut  algorithms,  scenario  decomposition  techniques, 
and  combinatorial  branch-and-bound  methods  for  risk-averse  combinatorial  optimization  problems. 

In particular, the developed class of certainty equivalent (CE) measures of risk builds upon a new
representation for coherent and convex measures of risk that expresses the risk measure in the form of
an infimal convolution of some kernel function and, importantly, formalizes the key idea that a measure of risk
is a solution of a stochastic programming problem. One of the key properties of this new representation is
that it admits incorporation in stochastic programming problems in the form of convex constraints. By
selecting the kernel function in this representation in the form of the certainty equivalent, a well-known
construct in utility theory and decision making under uncertainty, we constructed a family of CE convex
nonlinear measures of risk, which allow for direct incorporation of the decision-maker's preferences, as
expressed by his/her utility function, into a downside risk measure, and also encompass a number of
risk measures existing in the literature, such as Conditional Value-at-Risk, Higher-Moment Coherent Risk
measures, etc. The corresponding results are presented in [2].
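As a concrete and well-known instance of a risk measure defined through an optimization problem, the Conditional Value-at-Risk of a loss $X$ at confidence level $\alpha$ can be written as $\min_\eta \{\eta + (1-\alpha)^{-1} \mathsf{E}[(X-\eta)_+]\}$ (the Rockafellar-Uryasev formula). The sketch below evaluates this expression for a discrete loss distribution. It assumes that larger values of $X$ are worse; the function name `cvar` and the sample data are illustrative and do not come from the report.

```python
def cvar(losses, probs, alpha):
    """CVaR_alpha of a discrete loss, computed as min over eta of
    eta + E[(X - eta)_+] / (1 - alpha)."""
    def objective(eta):
        expected_excess = sum(p * max(x - eta, 0.0) for x, p in zip(losses, probs))
        return eta + expected_excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in eta with breakpoints at
    # the loss values, so a minimizer can be found among the scenarios.
    return min(objective(eta) for eta in losses)


# Hypothetical four-scenario loss distribution with equal probabilities.
losses = [0.0, 1.0, 2.0, 10.0]
probs = [0.25, 0.25, 0.25, 0.25]
print(cvar(losses, probs, 0.5))   # average of the worst 50% of losses: 6.0
print(cvar(losses, probs, 0.75))  # average of the worst 25% of losses: 10.0
```

Because the minimization over $\eta$ is part of the definition, the same expression can be embedded in a larger optimization model by treating $\eta$ as an additional decision variable.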

Implementation of the developed measures of risk in decision-making problems under uncertainty leads
to mathematical programming problems with a specific set of constraints. A number of computational
methods  for  solving  such  problems  were  developed  in  the  course  of  the  project.  In  particular,  we 
considered  special  cases  of  CE  measures,  corresponding  to  the  choice  of  the  utility  function  in  the  form  of 
a  power  function  and  an  exponential  function.  In  the  case  of  power  utility  function,  the  corresponding 
certainty  equivalent  measures  of  risk  reduce  to  higher-moment  coherent  measures  of  risk,  which  are 
implementable in stochastic programming problems via p-order cone constraints. p-Order cones represent
a generalization of well-known second-order cones, but unlike the latter, they are not self-dual, which
precludes development of fast, long-step self-dual interior-point algorithms for solving p-cone
programming problems. To this end, we developed solution methods based on polyhedral approximations
of  p-order  cones  and  subsequent  decomposition  of  the  obtained  approximating  linear  programming  (LP) 
problems.  It  has  been  shown  that  the  developed  method  allows  one  to  formulate  an  exact  solution  method 
for  p-cone  programming  problems,  with  iteration  complexity  that  is  on  par  with  state-of-the-art  first-order 
methods for second-order programming problems. The corresponding results are subsequently used in an
exact branch-and-bound algorithm for discrete and combinatorial p-cone programming problems. The
utilization  of  polyhedral  approximations  of  p-cones  at  each  node  of  the  branch-and-bound  tree  allows  for 
taking  advantage  of  “warm-start”  capabilities  of  linear  programming  solvers,  and  subsequently  reduces 
solution  time  by  orders  of  magnitude,  compared  to  branch-and-bound  schemes  based  on  solving  the 
nonlinear relaxation of the integer p-order cone programming problem at each branch-and-bound node [4]. A
separate research thrust was dedicated to the development of branch-and-cut techniques for integer p-order
cone programming. In this context, mixed-integer rounding cuts and lifted cuts were developed in [6]. A
scenario decomposition technique for solving large-scale stochastic programming problems with risk
measures represented in the form of infimal convolution, including certainty equivalent measures of risk,
was proposed in [1]. Importantly, this method has been proven to terminate in a number of iterations that
does not exceed the number of scenarios, a significant advantage over decomposition methods based on
supporting hyperplane representations, where the number of iterations could be exponential in the size of the
scenario set. In [11], a number of methods for handling a special class of nonlinear convex constraints
were  proposed  as  a  generalization  of  earlier  developed  techniques  for  p-order  cone  programming 
problems. 

The developed models and solution approaches were applied to problems of data mining and machine
learning [8] and to the identification of robust and risk-averse structures in graphs and combinatorial structures [5,
7, 9, 12, 13]. In papers that consider risk-averse combinatorial problems, a number of combinatorial
branch-and-bound algorithms were developed that incorporated solving a stochastic programming
problem at each node of the combinatorial branch-and-bound tree so as to obtain a bound on the risk of
the combinatorial substructure corresponding to the branch-and-bound node.

Abstract

We propose a two-stage stochastic programming framework for designing or identifying “resilient”,
or “repairable” structures in graphs whose topology may undergo a stochastic transformation.
The repairability of a subgraph satisfying a given property is defined in terms of a budget
constraint, which allows for a prescribed number of vertices to be added to or removed from the
subgraph so as to restore its structural properties after the observation of random changes to the
graph’s set of edges. A two-stage stochastic programming model is formulated and is shown to be
$\mathcal{NP}$-complete for a broad range of graph-theoretical properties that the resilient subgraph is required
to satisfy. A general combinatorial branch-and-bound algorithm is developed, and its computational
performance is illustrated on the example of the two-stage stochastic maximum clique problem.

Keywords:  Maximum  subgraph  problem,  two-stage  stochastic  optimization,  combinatorial  branch- 
and-bound  algorithm,  stochastic  maximum  clique  problem. 


1  Introduction  and  motivation 

An  important  feature  to  incorporate  in  a  networked  system's  design  is  an  inherent  resilience  to  withstand 
random  structural  changes  that  affect  the  relationship  characteristics  between  its  components.  A  reliable 
system  should,  therefore,  possess  a  high  tolerance  against  a  broad  range  of  possible  (failure)  scenarios, 
and,  moreover,  be  constructed  in  such  a  way  that  its  properties  can  be  restored  within  available  resource 
limits. 

In  the  present  study  we  pursue  an  approach  that  regards  a  distributed  subsystem,  or  subgraph,  to  be 
resilient  if  it  can  be  “repaired”  at  a  minimum  (or  fixed)  cost  after  a  random  change  in  the  underlying 

* Corresponding author, e-mail: krokhmal@engineering.uiowa.edu.




DISTRIBUTION  A:  Distribution  approved  for  public  release 


graph’s  topology.  More  specifically,  many  graph-theoretical  and  network  optimization  problems  consist 
in  finding  a  subgraph  with  prescribed  properties  that  has  the  largest  (respectively,  smallest)  size,  weight, 
etc.  Well-known  examples  include  the  shortest  path  problem,  maximum  clique/independent  set  problem, 
minimum  vertex  cover  problem,  and  so  on.  In  situations  when  the  topology  of  the  underlying  graph 
or  network  may  be  subject  to  changes  (e.g.,  deletions  of  vertices  and/or  edges),  the  “resilience”  of  the 
selected  subgraph  is  often  of  interest.  A  large  body  of  literature  has  been  accumulated  on  this  subject, 
where various interpretations of “reliability”, “resilience”, or “robustness” of subgraphs have been
explored (see, among others, [11, 13, 22, 25, 28]). Typically, robustness in this context is associated with
the  ability  of  the  selected  subgraph  to  satisfy  (exactly  or  to  a  certain  degree)  a  given  property,  or  perform 
a  given  function,  etc.,  after  deletion  of  edges  and/or  vertices.  Several  examples  include  network  flow 
control,  preservation  of  vertex  and  edge  connectivity,  maximization  of  overall  algebraic  connectivity, 
and  prevention  of  catastrophic  cascade  failures  [6-8,  13]. 

In this work we adopt the point of view that a structure in a network or graph is “resilient” if it is
“repairable” with respect to randomized changes in the graph's topology. Namely, we consider the
following general framework: assume that the given graph $G = (V, E)$ may undergo a randomized
change in the future, resulting in $G' = (V, E')$, where $E'$ is generally not a subset of $E$. Then, it is of
interest to identify vertex subsets $S, S' \subseteq V$ such that the induced subgraphs $G[S]$ and $G'[S']$ satisfy a
prescribed property $\Pi$, with additional requirements:

(i) the difference between sets $S$ and $S'$ is within a prescribed bound $M$, i.e., $|S \setminus S'| + |S' \setminus S| \le M$;

(ii) the size of $S$ and the expected size of $S'$ should be as large as possible.

In other words, the problem is to identify a set $S$ such that $G[S]$ has property $\Pi$ and is as large as possible
under the condition that, after a random change to the graph’s set of edges, the set $S$ may be modified
or “repaired” to form a set $S'$, such that $G'[S']$ satisfies $\Pi$ and the expected size of $S'$ is also as large as
possible.

The  described  framework  has  obvious  interpretations  in,  for  example,  the  defense  domain,  where  one 
may  be  interested  in  identifying  the  largest  networked  or  distributed  system  that  can  maintain  its  structure 
-  with,  perhaps,  necessary  repairs  -  under  adversarial  attacks. 

Mathematically,  the  described  framework  lends  itself  naturally  to  the  context  of  two-stage  stochastic 
optimization [5, 19], which models a decision-making process in the presence of uncertainties that involves
two  sequential  decisions.  The  first-stage  decision  is  made  before  the  actual  realization  of  uncertain 
factors  can  be  observed.  The  second-stage,  or  recourse  decision  is  made  upon  observing  the  realization  of 
uncertainties,  and  takes  into  account  both  the  preceding  first-stage  decision  and  the  observed  realization 
of  stochastic  parameters. 

Stochastic  recourse  problems  have  gained  much  attention  in  the  network  literature  due  to  their  versatility 
for  modeling  uncertainties.  Particular  emphasis  has  been  placed  on  network  problems  with  random 
elements  evidenced  in  forms  that  influence  the  overall  flow  distribution,  demands,  and  costs.  A  number 
of  applications  examine  stochastic  factors  in  the  context  of  vehicle  routing  and  network  flow  problems 
where  uncertainties  are  attributed  to  arc  capacities  or  node  demands  (see  e.g.,  [3,  9,  14,  15,  27]).  Several 
similar  considerations  utilized  a  two-stage  recourse  framework  to  enhance  the  design  of  stochastic  supply 
chain  networks  and  network  resource  allocation  [10,  24].  Other  studies  examined  the  preservation  of 
connections  between  vertices  when  the  edge  costs  are  uncertain  [7,  16],  as  well  as  decision  making  in 
routing  problems  with  stochastic  edge  failures  [26]. 




Although uncertainty in the aforementioned studies mostly influenced decisions related to directed
flows and routing, less focus has been put on developing two-stage recourse constructs for
designing/identifying graphs that are adept at maintaining their connection properties in situations when
random factors affect/alter/damage their original physical characteristics. A notable non-recourse problem
of finding the largest subset of vertices that form a clique with a specified probability, given that edges
in the graph can fail with some probabilities, was studied in [17]. A similar approach in application
to certain clique relaxations was pursued in [31]. In this work, we introduce a two-stage stochastic
recourse framework for identifying “sustainable” subgraphs whose structural properties are influenced by
definite edge failures and/or construction in each random scenario realization. The proposed model is
general and in principle can be adapted to address a broad range of structural graph properties, along
with uncertainties in the form of vertex failures.

The remainder of the article is organized as follows. In Section 2 we discuss the deterministic
graph-theoretic underpinnings and establish a mathematical programming representation of the two-stage
stochastic recourse maximum subgraph problem. Section 3 presents an efficient graph-based
(combinatorial) branch-and-bound solution algorithm for instances when the desired subgraphs possess hereditary
structural properties. Finally, Section 5 considers a numerical case study demonstrating the
effectiveness of the proposed algorithm for solving two-stage stochastic recourse maximum clique (i.e., complete
graph) problems.


2  Problem  definition 

In  this  section  we  present  a  formal  graph-theoretical  description  of  the  discussed  framework.  Before 
introducing  the  stochastic  model  that  represents  the  focus  of  the  present  work,  we  outline  the  relevant 
deterministic  concepts,  which  pertain  to  problems  involving  identification  of  the  largest  subgraph/subset 
of  a  system's  vertices  that  collectively  possess  a  specified  structural  property. 

2.1  Deterministic  maximum  subgraph  problem 

Let $G = (V, E)$ represent an undirected graph where each vertex $i \in V$ is a component of the networked
system, and an edge $(i, j) \in E$ defines a connection/relation between vertices $i$ and $j$. Then, the problem
of finding the largest (sub)graph $S \subseteq V$ of vertices with a prescribed structural property $\Pi$, also known
as the maximum subgraph problem, or maximum $\Pi$ problem, is given by

$$\max_{S \subseteq V} \{ |S| : G[S] \ni \Pi \}, \tag{1}$$

where $G[S]$ denotes the subgraph of $G$ induced by $S$, i.e., a graph such that any two of its vertices $i, j$ are
connected by an edge if and only if $(i, j)$ is an edge in graph $G$. Here and throughout the text the relation
$G[S] \ni \Pi$ stands for “$G[S]$ satisfies property $\Pi$” (we also say that $S$ is a $\Pi$-subgraph of $G$); similarly,
$G[S] \not\ni \Pi$ represents the converse statement.

In the context of the maximum subgraph problem (1), an important class of graph-theoretical properties
$\Pi$ is represented by properties that are hereditary with respect to induced subgraphs (or just hereditary
for short). Namely, $\Pi$ is called hereditary with respect to induced subgraphs if for any graph that satisfies
$\Pi$, removal of any vertex from this graph results in an induced subgraph that also satisfies $\Pi$ [1, 4,
30]. The class of hereditary properties encompasses many well-known and important graph-theoretical
properties, such as completeness, independence, planarity, and so on.

The practical and theoretical significance of the class of hereditary properties in relation to the maximum
subgraph problem (1) stems from the fact that a large number of important and difficult graph-theoretical
problems are special cases of (1) when $\Pi$ is hereditary and “meaningful” in some sense. Namely, $\Pi$ is
called nontrivial if it is satisfied by a single-vertex graph yet not satisfied by every graph, and is called
interesting if the order of graphs satisfying $\Pi$ is unbounded [30]. Then, the following fundamental
observation regarding problem (1) holds:

Theorem 1 (Yannakakis [30]) If property $\Pi$ is hereditary with respect to induced subgraphs, nontrivial,
and interesting, then the maximum subgraph problem (1) is $\mathcal{NP}$-complete.
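To make problem (1) and the hereditary property concrete, the following minimal sketch takes $\Pi$ to be completeness, so that (1) becomes the maximum clique problem, and solves it by exhaustive enumeration. In line with Theorem 1, the enumeration is exponential in $|V|$ and is intended only for tiny illustrative graphs. The function names and the example graph are assumptions made for illustration and are not part of the original text.

```python
from itertools import combinations


def is_clique(edges, subset):
    """Property check: every pair of vertices in `subset` is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def max_subgraph(vertices, edges, has_property):
    """Brute-force solution of problem (1): a largest S whose induced
    subgraph satisfies the property (exponential time)."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if has_property(edges, subset):
                return set(subset)
    return set()


# Hypothetical graph: a triangle {1, 2, 3} with a pendant vertex 4.
V = [1, 2, 3, 4]
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}

best = max_subgraph(V, E, is_clique)
print(sorted(best))  # [1, 2, 3]

# Hereditary property: removing any vertex from a clique leaves a clique.
print(all(is_clique(E, best - {v}) for v in best))  # True
```

Any other hereditary property can be substituted by passing a different `has_property` predicate with the same signature.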

In many practical applications, the topology of graph $G$ in the maximum subgraph problem (1) may not
always be assumed constant, and is subject to unpredictable, or stochastic changes (e.g., edge and/or
vertex failures). Once graph $G$ is assumed to be stochastic, however, formulation (1) becomes ill-posed,
since it does not provide a guarantee or conditions under which the selected subgraph $G[S]$ satisfies the
sought property $\Pi$. Therefore, in the presence of uncertainties formulation (1) of the maximum subgraph
problem has to be modified so as to explicitly specify the conditions under which its solution can be
considered a $\Pi$-subgraph of the (stochastic) graph $G$. One common approach in the literature is to require
that the solution of an optimization problem with stochastic data satisfies the required properties with
a prescribed probability; an application of this approach to a maximum clique problem on stochastic
graphs was considered in [17]. In the present endeavor, we require that the solution of the maximum
subgraph problem on a stochastic graph is “repairable” in some sense.

2.2  A  two-stage  stochastic  maximum  subgraph  problem 

Here we introduce an approach, based on two-stage stochastic programming and tentatively outlined in
Section 1, for determining “resilient” maximum $\Pi$-subgraphs in situations when the topology of the
underlying graph $G$ may be subject to uncertain (random) future changes.

Given a probability space $(\Omega, \mathcal{F}, \mathsf{P})$, where $\Omega$ is the set of random events, $\mathcal{F}$ is the sigma-algebra, and $\mathsf{P}$
is the probability measure, we assume that the topology of a graph $G = (V, E)$ may undergo a random
transformation at some moment in the future, resulting in an updated graph $G(\omega) = (V, E(\omega))$, $\omega \in \Omega$.
In this work, it is assumed for simplicity that only the set of edges $E = E(\omega)$ may be dependent on the
random event $\omega$, while the set of vertices $V$ is constant. As will be seen next, the proposed formulation
and solution method can be generalized to account for the possibility of a stochastic set $V$.

As is traditional in the stochastic programming literature, it is assumed that the set $\Omega$ is finite,
$\Omega = \{\omega_1, \ldots, \omega_N\}$, such that $\mathsf{P}(\omega_k) = p_k > 0$ for $k = 1, \ldots, N$, and $\sum_k p_k = 1$. Consequently,
the possible changes to the topology of graph $G$ are observed in the form of $N$ discrete scenarios
$\{G(\omega_1), \ldots, G(\omega_N)\}$, where $G(\omega_k) = (V, E(\omega_k))$. For notational convenience, we will denote
$G_k = G(\omega_k)$, $E_k = E(\omega_k)$; also, to emphasize that the original graph $G$ represents the unchanged, or
“null” state of a distributed system, we denote $G_0 = G = (V, E_0)$, where $E_0 = E$ represents the initial
set of edges in the graph.

Characterization of “resilient” substructures in graphs subjected to randomized topology changes via the
formalism of two-stage stochastic programming is the key feature of the proposed approach. In general,
a two-stage stochastic programming model may be presented in the form

$$\begin{aligned}
\min\ & f_1(x) + \mathsf{E}\, f_2(x, y(\omega), \omega) \\
\text{s.t.}\ & h_1(x) \le 0, \\
& h_2(x, y(\omega), \omega) \le 0, \quad \forall \omega \in \Omega.
\end{aligned} \tag{2}$$

Here $x$ represents the first-stage decision/action that is made before the actual realization of the uncertain
event $\omega$ can be observed. Associated with the first-stage decision are the first-stage cost $f_1(x)$ and the
first-stage constraints $h_1(x) \le 0$ that the vector $x$ has to satisfy. Since the first-stage decision $x$ may
not be optimal for every given possible realization of $\omega$, a recourse, or second-stage corrective decision
$y = y(\omega)$ is made after the actual realization of $\omega$ has been observed, so as to minimize some
second-stage cost $f_2(x, y(\omega), \omega)$. Importantly, the second-stage decision must also satisfy specific second-stage
constraints $h_2(x, y(\omega), \omega) \le 0$ for any given first-stage $x$. Note that the second-stage decision depends
explicitly on the specific realization of $\omega$ as well as on the first-stage decision $x$. In turn, the first-stage
decision must take into account all possible realizations of the random element $\omega$ and the corresponding
subsequent recourse decisions $y(\omega)$. This interdependency is emphasized by the following “nested”, or
recourse representation of the (extensive) form of two-stage stochastic programming formulation (2):

$$\min \{ f_1(x) + \mathsf{E}\, Q(x, \omega) : h_1(x) \le 0 \}, \tag{3a}$$

where $Q$ is the second-stage function that represents the optimal second-stage cost given the first-stage
vector $x$ and the observed realization $\omega$:

$$Q(x, \omega) = \min \{ f_2(x, y(\omega), \omega) : h_2(x, y(\omega), \omega) \le 0 \}. \tag{3b}$$

According to the above, the following two-stage framework is adopted for identification of “resilient”
$\Pi$-subgraphs in $G_0$:

- Given a graph $G_0 = (V, E_0)$, find a set of vertices $S_0 \subseteq V$ such that the induced subgraph $G_0[S_0]$
satisfies $\Pi$ (“first stage”).

- Graph $G_0$ undergoes a randomized change of topology. It is assumed that the resulting graph
$G_k = (V, E_k)$ is chosen at random with probability $p_k$ from a collection of graphs $\{G_1, \ldots, G_N\}$
(“observation of uncertainty”).

- For any given realization $G_k$, select sets $\Delta_k^+ \subseteq V \setminus S_0$ and $\Delta_k^- \subseteq S_0$, such that after
“augmentation” or “repair” of the original set $S_0$ the resulting set $S_k$,

$$S_k := (S_0 \setminus \Delta_k^-) \cup \Delta_k^+,$$

induces a subgraph $G_k[S_k]$ on $G_k$ that satisfies $\Pi$ (“second, or recourse stage”).

- Sets $S_0$ and $\Delta_k^\pm$ must be chosen in such a way that the expected size of the $\Pi$-subgraph in the first and
second stages is maximized, and sets $\Delta_k^\pm$ contain no more than $M$ vertices,

$$|\Delta_k^+| + |\Delta_k^-| \le M. \tag{4}$$
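
The recourse step of the framework above amounts to a small set operation, which the following sketch spells out. It applies $S_k = (S_0 \setminus \Delta_k^-) \cup \Delta_k^+$ and checks the conditions $\Delta_k^- \subseteq S_0$, $\Delta_k^+ \subseteq V \setminus S_0$, and the budget constraint (4); it does not verify property $\Pi$. The function name `repair` and its argument names are hypothetical.

```python
def repair(s0, removed, added, budget):
    """Return S_k = (S_0 minus removed) union added after validating the
    recourse-stage conditions; `removed` and `added` play the roles of the
    delta-sets and `budget` is the bound M in constraint (4)."""
    s0, removed, added = set(s0), set(removed), set(added)
    if not removed <= s0:
        raise ValueError("removed vertices must belong to S_0")
    if added & s0:
        raise ValueError("added vertices must lie outside S_0")
    if len(removed) + len(added) > budget:
        raise ValueError("repair exceeds the budget M of constraint (4)")
    return (s0 - removed) | added


s0 = {1, 2, 3}
sk = repair(s0, removed={3}, added={4}, budget=2)
print(sorted(sk))    # [1, 2, 4]
print(len(s0 ^ sk))  # 2, the size of the symmetric difference
```

Since the two delta-sets are disjoint, the number of repaired vertices equals the size of the symmetric difference between $S_0$ and $S_k$, which is why the budget can equivalently be checked with `len(s0 ^ sk)`.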
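
Then, the two-stage stochastic maximum subgraph (TSMS) problem can be stated in the graph-theoretical
formulation as follows:

$$\begin{aligned}
\max\ & |S_0| + \sum_{k \in \mathcal{N}} p_k |S_k| && \text{(5a)} \\
\text{s.t.}\ & G_k[S_k] \ni \Pi, \quad \forall k \in \{0\} \cup \mathcal{N}, && \text{(5b)} \\
& |S_0 \setminus S_k| + |S_k \setminus S_0| \le M, \quad \forall k \in \mathcal{N}, && \text{(5c)} \\
& S_k \subseteq V, \quad \forall k \in \{0\} \cup \mathcal{N}, && \text{(5d)}
\end{aligned}$$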

where $\mathcal{N} = \{1, \ldots, N\}$. Obviously, the delta-sets $\Delta_k^\pm$ defined above are related to the second-stage sets
$S_k$ as

$$\Delta_k^+ = S_k \setminus S_0, \quad \Delta_k^- = S_0 \setminus S_k, \quad k \in \mathcal{N}.$$

The above extended formulation of the two-stage stochastic programming problem can be presented in
the recourse form similar to (3):

$$\max_{S_0 \subseteq V} \Big\{ |S_0| + \sum_{k \in \mathcal{N}} p_k Q_k(S_0) : G_0[S_0] \ni \Pi \Big\}, \tag{6a}$$

where the second-stage function $Q_k$ has the form

$$Q_k(S) = \max_{S_k \subseteq V} \{ |S_k| : G_k[S_k] \ni \Pi, \ |S \setminus S_k| + |S_k \setminus S| \le M \}. \tag{6b}$$
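
The recourse form (6a)-(6b) can be evaluated directly on very small instances. The sketch below does so by exhaustive enumeration for the case where $\Pi$ is completeness (cliques). It is not the branch-and-bound method developed in Section 3, only a hedged illustration of how the first-stage objective nests the second-stage function $Q_k$; all function names and the example data are assumptions.

```python
from itertools import combinations


def is_clique(edges, subset):
    """True if every pair of vertices in `subset` is adjacent."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def subsets(vertices):
    """Yield every subset of `vertices` as a frozenset."""
    for size in range(len(vertices) + 1):
        for combo in combinations(vertices, size):
            yield frozenset(combo)


def recourse_value(vertices, edges_k, s0, budget):
    """Q_k(S_0) from (6b): size of a largest clique of G_k whose symmetric
    difference with S_0 is at most `budget`; -inf if no such clique exists."""
    sizes = [len(sk) for sk in subsets(vertices)
             if is_clique(edges_k, sk) and len(s0 ^ sk) <= budget]
    return max(sizes) if sizes else float("-inf")


def solve_tsms(vertices, edges0, scenarios, budget):
    """Brute-force (6a). `scenarios` is a list of (p_k, edges_k) pairs.
    Returns the best objective value and a first-stage set attaining it."""
    best_value, best_s0 = float("-inf"), None
    for s0 in subsets(vertices):
        if not is_clique(edges0, s0):
            continue
        value = len(s0) + sum(p * recourse_value(vertices, ek, s0, budget)
                              for p, ek in scenarios)
        if value > best_value:
            best_value, best_s0 = value, s0
    return best_value, best_s0


# Hypothetical instance: in scenario 1 edge (1, 2) fails and edge (2, 4)
# appears; in scenario 2 the graph is unchanged.
V = [1, 2, 3, 4]
E0 = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
E1 = {frozenset(e) for e in [(1, 3), (2, 3), (3, 4), (2, 4)]}

value, s0 = solve_tsms(V, E0, [(0.5, E1), (0.5, E0)], budget=1)
print(value, sorted(s0))  # 5.5 [1, 2, 3]
```

In this instance the triangle $\{1, 2, 3\}$ is selected in the first stage; in scenario 1 it is repaired by dropping one vertex (size 2), and in scenario 2 it is kept unchanged (size 3), giving $3 + 0.5 \cdot 2 + 0.5 \cdot 3 = 5.5$.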


Complexity of the two-stage stochastic maximum subgraph problem (5)-(6) is established in the next
two propositions. For this, consider the decision version of the two-stage stochastic maximum
subgraph problem (5)-(6), denoted as $\langle (G_0, \ldots, G_N), (p_1, \ldots, p_N), M, q \rangle$: given a set of $N + 1$ graphs
$G_0, \ldots, G_N$ such that $V(G_0) = \ldots = V(G_N)$, a set of positive rational numbers $p_1, \ldots, p_N$ such that
$p_1 + \ldots + p_N = 1$, an integer $M \ge 0$, and a rational $q \ge 0$, determine whether graphs $G_i$ contain
$\Pi$-subgraphs $S_i$ such that $|S_0 \setminus S_i| + |S_i \setminus S_0| \le M$ for all $i = 1, \ldots, N$ and $|S_0| + \sum_{k=1}^{N} p_k |S_k| \ge q$.

Similarly, the decision version of the maximum subgraph problem, denoted as $\langle G, m \rangle$, is as follows:
given a graph $G$ and a nonnegative integer $m$, determine whether $G$ contains a $\Pi$-subgraph $S$ such that
$|S| \ge m$.

Proposition 1 The decision version of the two-stage stochastic maximum subgraph problem (5) is
$\mathcal{NP}$-complete, provided that the corresponding maximum subgraph problem is $\mathcal{NP}$-complete.

Proof: Noting that the two-stage stochastic maximum subgraph problem is obviously in $\mathcal{NP}$, we prove
its $\mathcal{NP}$-completeness by reduction from the maximum subgraph problem. Given an instance $\langle G, m \rangle$ of
the maximum subgraph problem, let $G_i^* = G$ for $i = 0, \ldots, N$, select arbitrary rational $p_i^* > 0$ such
that $p_1^* + \ldots + p_N^* = 1$, an arbitrary integer $M^* \ge 0$, and let $q^* = 2m$. Then, a collection of sets
$S_0^* = \ldots = S_N^* \subseteq V(G_i^*)$, $i = 0, \ldots, N$, satisfies the condition $|S_0^* \setminus S_i^*| + |S_i^* \setminus S_0^*| \le M^*$, and,
moreover, satisfies $|S_0^*| + \sum_{k=1}^{N} p_k^* |S_k^*| \ge m + m = q^*$ if and only if there exists a $\Pi$-subgraph $S \subseteq V(G)$ of order
$|S| \ge m$. □

Next, we observe that for any given first-stage solution, “repairing” it in the second stage via solving
the second-stage problem (6b) is $\mathcal{NP}$-complete as well. To this end, the corresponding decision version
$\langle G_k, S, M, q \rangle$ of the second-stage maximum subgraph problem is formulated as follows: given a
second-stage graph $G_k$, a first-stage solution $S \subseteq V(G_0) = V(G_k)$, and integer numbers $M \ge 0$ and $q \ge 0$,
determine if a $\Pi$-subgraph $S_k \subseteq V(G_k)$ of order at least $q$ exists such that $|S \setminus S_k| + |S_k \setminus S| \le M$.
Then, the next observation holds.

Proposition 2 The decision version of the second-stage maximum subgraph problem (6b) at
any scenario $k \in \mathcal{N}$ is $\mathcal{NP}$-complete if property $\Pi$ is such that the maximum subgraph problem is
$\mathcal{NP}$-complete.

Proof: First, note that the second-stage maximum subgraph problem is in $\mathcal{NP}$. Next, observe that the
order of the $\Pi$-subgraph of $G_k$ that satisfies the budget constraint $|S \setminus S_k| + |S_k \setminus S| \le M$ cannot exceed
$\min\{|S| + M, |V|\}$. Then, for a given instance of the maximum subgraph problem $\langle G, m \rangle$, construct an
instance $\langle G_k^*, S^*, M^*, q^* \rangle$ of the second-stage maximum subgraph problem with $G_k^* = G$, $S^* = \{i\}$ for
a fixed $i \in V(G)$, $M^* = m - 1$, and $q^* = m$. The order of the largest $\Pi$-subgraph of the thus
constructed instance $\langle G_k^*, S^*, M^*, q^* \rangle$ of the second-stage maximum subgraph problem is always less than
or equal to $m$ according to the above observation; moreover, it is equal to $m$ if and only if there exists a
$\Pi$-subgraph of $G$ of order $m$ that contains vertex $i$. Therefore, the question of whether a graph $G$ has a
$\Pi$-subgraph with $m$ vertices can be answered by solving no more than $|V(G)|$ instances $\langle G, \{i\}, m - 1, m \rangle$
of the second-stage problem as described above. □

Note that while the introduced model assumes a common property $\Pi$ for the subgraphs selected during
both decision stages, possible extensions may include distinct properties at each stage. Further, the
model may be enhanced by imposing nonuniform cost structures associated with selecting, adding and
removing the vertices; or by introducing different budgetary restrictions in different scenarios.


3  A combinatorial branch-and-bound solution technique for the two-stage stochastic maximum subgraph problem


In this section we introduce an exact graph-based, or combinatorial branch-and-bound (BnB) algorithm
for solving problem (5)-(6). We emphasize, however, that the computational efficiency of the proposed
method - as with all BnB schemes - depends to a great extent on the specific branching and bounding
criteria used for processing of the search space with respect to a particular property $\Pi$. An illustration
of the proposed procedure is furnished in Section 5 for the case when $\Pi$ represents the completeness
property of a subgraph.

The proposed technique for solving the two-stage stochastic maximum subgraph problem relies on the
recourse representation (6a)-(6b) and employs “nested” BnB algorithms for construction of first- and
second-stage $\Pi$-subgraphs, respectively, that satisfy the interrelationships imposed by the budgetary
repair constraints (5c). Namely, a first-stage BnB procedure identifies first-stage subgraphs in $G_0$ that
satisfy property $\Pi$, while an embedded second-stage BnB is used to determine the largest possible
associated $\Pi$-subgraphs in $G_1, \ldots, G_N$ that can be supported within the repair budget after changes to the
original graph $G_0$ occur. Both algorithms work by navigating between levels of the respective BnB trees
until the subgraphs of $G_0$ and $G_k$, $k = 1, \ldots, N$, that maximize the objective of (5)-(6) are found.

For convenience of notation, it is assumed that $S_0$ and $S_k$, $k \in \mathcal{N}$, represent feasible solutions (subgraphs) during all but the last iterations of the respective BnB algorithms, upon which they coincide with the optimal solution(s) to problem (6).

3.1  First-stage branch-and-bound algorithm

The first-stage BnB algorithm begins at level $\ell = 0$ with a partial solution $S_0 := \emptyset$ and with partial and global lower bounds on the objective value of problem (6a), $Z := -\infty$ and $Z^* := -\infty$, respectively. Throughout the algorithm the partial solution $S_0$ contains vertices in $V$ such that $G_0[S_0]$ satisfies property $\Pi$.

At the current node of the BnB tree, level $\ell$ is associated with the candidate set $C_\ell$ of vertices from which any single vertex can be added to the partial solution $S_0$ without violating property $\Pi$. Branching is conducted by removing a branching vertex $q$ from $C_\ell$ and adding it to the partial solution $S_0$. The algorithm is initialized with $C_0 := V$, and once a vertex $q$ is selected, the candidate set at level $\ell + 1$ is constructed by eliminating all the vertices from $C_\ell$ whose inclusion in $S_0$ would violate property $\Pi$:

$$ C_{\ell+1} := \{\, i \in C_\ell : G_0[S_0 \cup i] \ni \Pi \,\}. \tag{7} $$

As will be seen next, constructing candidate set $C_{\ell+1}$ from the preceding candidate set $C_\ell$ is one of the basic steps of the algorithm, and the computational cost of this step can significantly affect the performance of the solution method. In this regard, a major question is whether one can efficiently verify property $\Pi$ for any given subgraph. The associated decision problem is as follows: given a subgraph $S$, determine whether $S$ satisfies property $\Pi$, or whether some fraction of the representation of $S$ can be modified in order for $S$ to satisfy property $\Pi$. In the latter case it is said that $S$ is $\epsilon$-far from satisfying property $\Pi$, where $\epsilon$ corresponds to the fraction of modifications that need to be made. With respect to hereditary properties, a substantial body of literature has accumulated in recent years to address this question. For example, Alon and Shapira [2] showed that every hereditary property is testable with one-sided error. Further, several characterizations of hereditary properties have been proposed [12]. As described above, a property $\Pi$ is said to be node-hereditary if it is closed under taking induced subgraphs of $G$, and subgraph-hereditary if it is closed under taking subgraphs of $G$. A property is minor-hereditary if every graph minor$^1$ $S$ of a graph $G$ that satisfies $\Pi$ also satisfies $\Pi$. In a series of seminal studies [20, 21], Robertson and Seymour established the graph minor theorem which, among other consequences, implies polynomial-time identification of hereditary properties closed under graph minors.

In what follows, we implicitly assume that the property $\Pi$ of a graph can be tested in polynomial time.
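The candidate-set update (7) can be sketched with the property test passed in as a predicate. This is only an illustration under stated assumptions: the function names, the toy edge list, and the choice of "independent set" as the hereditary property are all hypothetical and are not taken from the report.

```python
# Minimal sketch of update (7): keep the candidates whose addition to the
# partial solution preserves the property. All names here are illustrative.

def next_candidates(cand, partial, has_property):
    """Return {i in cand : partial ∪ {i} satisfies the property}."""
    return {i for i in cand if has_property(partial | {i})}

# Toy graph (assumption): a path 0-1-2-3.
EDGES = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]}

def is_independent(vertices):
    """Example hereditary property: no two selected vertices are adjacent."""
    vs = list(vertices)
    return all(frozenset((a, b)) not in EDGES
               for idx, a in enumerate(vs) for b in vs[idx + 1:])

partial = {0}          # branching vertex 0 has just been added
cand = {1, 2, 3}       # candidate set with the branching vertex removed
print(next_candidates(cand, partial, is_independent))
```

The cost of each call is one property test per candidate, which is why the polynomial-time testability assumption matters for this step.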

Bounding$^2$ of the partial subgraph $S_0$ involves determining the quality of the solution that can be obtained by further exploring the vertices in $C_{\ell+1}$. Observe that the most opportune realization of uncertainties is one in which the structure of the edge sets $E_k$, $k \in \mathcal{N}$, preserves property $\Pi$ of $S_0$ in each graph $G_k$ and, provided that sufficiently many favorable edge modifications occur, the budget $M$ can be used exclusively to add new vertices from $V \setminus S_0$ to subgraph $S_0$. In other words, under “ideal” circumstances a second-stage solution of size $\min\{|S_0| + M, |V|\}$ is obtained in any given scenario $k \in \mathcal{N}$. For a given $S \subseteq V(G)$, let $\upsilon_\Pi(G[S])$ represent an upper bound on the size of the largest possible $\Pi$-subgraph contained in the induced graph $G[S]$, where subscript $\Pi$ indicates that the properties and computation of this bound depend explicitly on $\Pi$. Then $\min\{\upsilon_\Pi(G_0[S_0 \cup C_{\ell+1}]) + M, |V|\}$ represents an upper bound on the potential contribution of the recourse action, whence the left-hand side of the expression

$$ \upsilon_\Pi(G_0[S_0 \cup C_{\ell+1}]) + \min\{\upsilon_\Pi(G_0[S_0 \cup C_{\ell+1}]) + M,\ |V|\} \le Z^* \tag{8} $$

provides a “best-case” objective value for problem (6a) with respect to the current partial solution $S_0$ and candidate set $C_{\ell+1}$. Inequality (8) determines whether the algorithm branches further or backtracks: if (8) is violated, the algorithm proceeds to solve the second-stage recourse problems (6b) for all $k \in \mathcal{N}$; otherwise, the algorithm backtracks.

$^1$ A graph $S$ is a minor of $G$ if $S$ can be obtained from a subgraph of $G$ by edge contractions.

$^2$ The specific mechanisms of both branching and bounding should be selected according to the subgraph property $\Pi$ under consideration.

In general, the modified sets of edges $E_k$, $k \in \mathcal{N}$, will not preserve the property $\Pi$ of $S_0$ as described. A significant drawback of condition (8) is therefore its disregard for structural variations between $G_0$ and $G_k$, particularly relative to how well solution $S_0$ and the vertices in candidate set $C_{\ell+1}$ will “perform” in any given scenario realization $k \in \mathcal{N}$. It is therefore of interest to introduce several feasibility and reparability conditions towards improving condition (8) in the context posed by the following question: given a current first-stage solution $S_0$ and corresponding candidate set $C_{\ell+1}$, what is the minimum number of modifications that must be made to $S_0$ in any second-stage scenario $k \in \mathcal{N}$ in order to ensure property $\Pi$?

One possibility is to perform the feasibility test furnished by the next proposition prior to solving $Q_k(S_0)$ for $k \in \mathcal{N}$.

Proposition 3  For a given scenario $k \in \mathcal{N}$, let $S_0^{(k)}$ represent a subset of $S_0$ that induces a $\Pi$-subgraph in $G_k[S_0]$. If the following condition is satisfied,

$$ |S_0| - \max_{S_0^{(k)} \subseteq S_0} \bigl\{ |S_0^{(k)}| : G_k[S_0^{(k)}] \ni \Pi \bigr\} > M, \tag{9} $$

then subgraph $S_0$ is an infeasible (irreparable) first-stage solution to problem (5)-(6).

Proof:  Recall that the induced subgraph $G_0[S_0]$ has property $\Pi$ by construction. Clearly, since the vertices remain fixed between the decision stages, the largest possible set of vertices $S_0^{(k)}$ such that $G_k[S_0^{(k)}] \ni \Pi$ is no larger than $|S_0|$ (i.e., $S_0^{(k)} \subseteq S_0$). Hence, the left-hand side of expression (9) represents the smallest number of vertices that must be removed from $S_0$ in order to obtain a subset $S_0^{(k)}$ that induces a subgraph $G_k[S_0^{(k)}]$ with property $\Pi$ under scenario $k \in \mathcal{N}$. This immediately implies that if condition (9) holds for any $k \in \mathcal{N}$, the budget constraint in (6b) cannot be satisfied.  □

Finding the maximum subset $S_0^{(k)}$ that induces a $\Pi$-subgraph in $G_k[S_0]$ by solving a problem of type (1) for each scenario $k \in \mathcal{N}$ in expression (9) is clearly computationally impractical. Instead, we utilize the fact that $|S_0| \ge \upsilon_\Pi(G_k[S_0]) \ge |S_0^{(k)}|$, and employ a more efficient condition by replacing the second term in expression (9) by $\upsilon_\Pi(G_k[S_0])$:

$$ |S_0| - \upsilon_\Pi(G_k[S_0]) > M. \tag{10} $$

Obviously, condition (9) is satisfied whenever (10) holds. Assuming that subgraph $S_0$ is deemed feasible under the current assumptions, the left-hand side of (10) represents an approximation (a lower bound) of the minimum number of vertices that must be removed from $S_0$ under scenario $k \in \mathcal{N}$.

By a similar argument, it is possible to determine the number of vertices that will have to be removed from subgraph $S_0$ in the second stage if a vertex $i \in C_{\ell+1}$ is added to $S_0$ in the first stage.

Corollary 1  If inequality (9) is satisfied in scenario $k \in \mathcal{N}$, then vertex $i \in C_{\ell+1}$ can be removed from $C_{\ell+1}$ if the condition

$$ |S_0 \cup i| - \max_{S_i^{(k)} \subseteq S_0 \cup i} \bigl\{ |S_i^{(k)}| : G_k[S_i^{(k)}] \ni \Pi \bigr\} > M \tag{11} $$

holds for some $\Pi$-subgraph $S_i^{(k)}$ in the induced subgraph $G_k[S_0 \cup i]$.

An approximation analogous to that of (10) is then obtained:

$$ |S_0 \cup i| - \upsilon_\Pi(G_k[S_0 \cup i]) > M. \tag{12} $$

All vertices $i \in C_{\ell+1}$ that satisfy (12) are removed prior to computing $\upsilon_\Pi(G_0[S_0 \cup C_{\ell+1}])$; the resulting “refined” candidate set

$$ C'_{\ell+1} := \{\, i \in C_{\ell+1} : |S_0 \cup i| - \upsilon_\Pi(G_k[S_0 \cup i]) \le M,\ \forall k \in \mathcal{N} \,\} $$

produces a more conservative estimate $\upsilon_\Pi(G_0[S_0 \cup C'_{\ell+1}])$ in (8). To simplify the notation, it will hereafter be assumed that $C_{\ell+1}$ denotes the refined candidate set $C'_{\ell+1}$.
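The refinement based on test (12) can be sketched as follows. The sketch assumes the clique property and uses a deliberately crude upper bound (one plus the maximum degree inside the induced subgraph) as a stand-in for the bound on the largest property-satisfying subgraph; the function names and the two toy scenario graphs are hypothetical.

```python
# Sketch of the refined candidate set: drop candidate i if, in some scenario,
# |S0 ∪ {i}| minus an upper bound on the largest clique in G_k[S0 ∪ {i}]
# exceeds the budget M. Graphs are dicts mapping a vertex to its neighbor set.

def degree_bound(adj, vertices):
    """Crude clique-size bound (assumption): 1 + max degree inside the subgraph."""
    vs = set(vertices)
    if not vs:
        return 0
    return 1 + max(len(adj[v] & vs) for v in vs)

def refine_candidates(cand, s0, scenario_adjs, budget, bound=degree_bound):
    keep = set()
    for i in cand:
        grown = s0 | {i}
        if all(len(grown) - bound(adj, grown) <= budget for adj in scenario_adjs):
            keep.add(i)
    return keep

# Scenario 1: complete graph on {0,1,2,3}, vertex 4 isolated.
ADJ1 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: set()}
# Scenario 2: triangle on {0,1,3}, vertices 2 and 4 isolated.
ADJ2 = {0: {1, 3}, 1: {0, 3}, 2: set(), 3: {0, 1}, 4: set()}

print(refine_candidates({3, 4}, {0, 1, 2}, [ADJ1, ADJ2], budget=1))
```

Any valid upper bound can be substituted for `degree_bound`; a tighter bound removes more candidates at a higher cost per test.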

When inequality (10) is violated in all scenarios $k \in \mathcal{N}$, then prior to solving problems $Q_k(S_0)$, $k = 1, \dots, N$, the following bounding condition for the objective value of problem (6a) is verified at the current node of the BnB tree:

$$ \upsilon_\Pi(G_0[S_0 \cup C_{\ell+1}]) + \sum_{k \in \mathcal{N}} p_k \min\{\upsilon_\Pi(G_k[S_0 \cup C_{\ell+1}]) + M_k,\ |V|\} \le Z^*, \tag{13} $$

where $M_k = M - \bigl(|S_0| - \upsilon_\Pi(G_k[S_0])\bigr)$, $k \in \mathcal{N}$, represent reduced budgets obtained from (10).
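As a worked illustration of bound (13), the sketch below evaluates its left-hand side from precomputed upper bounds. The function names, argument layout, and the numbers in the example are assumptions made for illustration; the comparison with the incumbent is read here as non-strict.

```python
# Sketch of the scenario-wise bound (13). ub0 bounds the first-stage graph on
# S0 ∪ C; ub_ext[k] and ub_s0[k] bound scenario graph k on S0 ∪ C and on S0.

def reduced_budget(budget, s0_size, ub_s0_k):
    """M_k = M - (|S0| - bound on the largest feasible subset of S0 in G_k)."""
    return budget - (s0_size - ub_s0_k)

def expected_bound(ub0, ub_ext, ub_s0, probs, budget, s0_size, n_vertices):
    total = ub0
    for k, p in probs.items():
        m_k = reduced_budget(budget, s0_size, ub_s0[k])
        total += p * min(ub_ext[k] + m_k, n_vertices)
    return total

def can_prune(bound_value, z_star):
    """Prune the node when even the optimistic value cannot beat the incumbent."""
    return bound_value <= z_star

lhs = expected_bound(ub0=4, ub_ext={1: 4, 2: 3}, ub_s0={1: 3, 2: 2},
                     probs={1: 0.5, 2: 0.5}, budget=2, s0_size=3, n_vertices=10)
print(lhs, can_prune(lhs, z_star=9.5), can_prune(lhs, z_star=float("-inf")))
```

In the example the reduced budgets are 2 and 1, so the scenario terms contribute 0.5·6 and 0.5·4.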

If inequality (13) is violated, two possibilities can arise with respect to the second-stage problems (6b). First, the second-stage problem (6b) may be infeasible for some $k$, given the current solution $S_0$. Then the corresponding second-stage function $Q_k(S_0)$ and the respective recourse function $\mathsf{E}[Q(S_0)] = \sum_{k \in \mathcal{N}} p_k Q_k(S_0)$ assume the value $-\infty$. In this case, vertex $q$ is removed from $S_0$ and the next branching vertex is selected from the candidate set if $C_\ell \neq \emptyset$. An illustration of such a case is given in Figure 1. Alternatively, all second-stage problems are feasible and the functions $Q_k(S_0)$, $k = 1, \dots, N$, are finite, whence the current objective value associated with problem (6a) is updated as $Z = |S_0| + \sum_{k \in \mathcal{N}} p_k Q_k(S_0)$; the global lower bound $Z^*$ is replaced by $Z$ if $Z^* < Z$. Then, if the candidate set is non-empty, $C_{\ell+1} \neq \emptyset$, the algorithm selects a branching vertex $q$ from the next level $\ell + 1$. The branching vertex $q$ at level $\ell$ is stored as $q_\ell$ for backtracking purposes. Alternatively, if $C_{\ell+1} = \emptyset$, the algorithm backtracks by removing vertex $q$ from $S_0$.

Whenever condition (13) is satisfied, there is no possibility of achieving an improvement over the global lower bound $Z^*$ by exploring further levels of the BnB tree; vertex $q$ is removed from $S_0$. If $C_\ell = \emptyset$, the algorithm backtracks to level $\ell - 1$ by removing from $S_0$ the most recent branching vertex that was used at level $\ell - 1$, namely vertex $q_{\ell-1}$. The described first-stage BnB procedure is formalized in Algorithm 1.

3.2  Second-stage  branch-and-bound  algorithm 

The BnB algorithm for solving the second-stage problem $Q_k(S_0)$, $k \in \mathcal{N}$, identifies the largest subgraph $S_k \subseteq V(G_k)$ with property $\Pi$ that satisfies the budget constraint (5c). As in the first-stage BnB technique, it navigates the levels of the (second-stage) BnB tree by exploring branching vertices from candidate sets that individually satisfy the property $\Pi$ with respect to the partial solution $S_k$. The bounding procedure of the second-stage algorithm pertains to eliminating unfavorable search space relative to the budgetary restriction $M$. Namely, the subgraph selected in the second stage must be feasible with respect to the first-stage partial solution $S_0$ in the sense that the number of vertices added to and removed from $S_0$ in scenario $k$ does not exceed the budget $M$.

[Figure 1 omitted: drawings of the first-stage solution $S_0 = \{a, b, c, d\}$ and its repairs in three second-stage scenarios.]

Figure 1: An example with three scenarios demonstrating the reparability of subgraph $S_0$ with a repair budget $M = 2$ and property $\Pi$ representing completeness. Black vertices represent those belonging to a complete subgraph. Observe that solution $S_0$ is feasible (repairable) with respect to scenarios $\omega_1$ and $\omega_2$, but is infeasible (not repairable) with respect to scenario $\omega_3$. Scenario $\omega_2$ also illustrates that the subgraphs in the first or second stages need not be maximal.


The algorithm begins by selecting a branching vertex $q$ from the candidate set $C_\ell^k$; initially $C_0^k := V$. Because adding vertices to and removing vertices from $S_0$ imposes a budgetary penalty, the natural tendency is to maintain as similar a structure as possible in the second stage. Noting that vertices common to $C_\ell^k$ and the solution $S_0$ do not utilize the budget $M$, a vertex $q \in \{S_0 \cap C_\ell^k\}$ is always selected first if $\{S_0 \cap C_\ell^k\} \neq \emptyset$. Once $q$ is added to the second-stage partial solution $S_k$, the candidate set $C_{\ell+1}^k$ at the next level is constructed by removing all the vertices from $C_\ell^k$ whose inclusion in $S_k$ would violate property $\Pi$.

Given the first- and second-stage partial solutions $S_0$ and $S_k$, respectively, the left-hand side of constraint (5c) can easily be computed as $\delta = |S_0 \setminus S_k| + |S_k \setminus S_0|$. Observe that the number of vertices in $C_{\ell+1}^k$ that could preserve or reduce the value of $\delta$ at consecutive levels of the BnB tree is given by $\gamma = |S_0 \cap C_{\ell+1}^k|$. Several bounding considerations emerge as a result.

The following conditions are possible when $\delta - \gamma \le M$:

(C1)  If $\delta \le M$, then (5c) is satisfied via vertices in $S_k$, and $S_k$ replaces $S_k^*$ if $|S_k| > |S_k^*|$. In cases when $\delta = M$ and $\gamma > 0$, a branching vertex $q \in \{S_0 \cap C_{\ell+1}^k\}$ is selected and the algorithm branches to level $\ell := \ell + 1$. On the other hand, if $\gamma = 0$, adding more vertices to $S_k$ will violate (5c); thus, the algorithm backtracks by removing the most recent branching vertex $q$ from $S_k$. If $\delta < M$ and $C_{\ell+1}^k \neq \emptyset$, the algorithm always branches.

(C2)  If $\delta > M$, the partial solution $S_k$ is infeasible with respect to (5c). However, the set $\{S_0 \cap C_{\ell+1}^k\}$ necessarily contains a sufficient number of vertices to (potentially) satisfy $M$ at deeper levels of the BnB tree, i.e., $\gamma \ge \delta - M$. The algorithm branches accordingly.

In cases when $\delta - \gamma > M$, restriction (5c) cannot be satisfied by exploring the vertices in $C_{\ell+1}^k$, and, therefore, the algorithm backtracks as before.

Algorithm 2 outlines the described solution technique for the second-stage problem $Q_k(S_0)$, $k \in \mathcal{N}$. Notice that enhancing the branching and/or bounding scheme is possible by applying structural considerations relative to property $\Pi$. However, in an effort to maintain a purely budgetary-based solution procedure that is independent of graph-structural properties, this notion is reserved for future investigations.

Algorithm 2: Second-stage combinatorial BnB method for computing $Q_k(S_0)$

    Input: $G_k$; $S_0$;
    Initialize: $\ell := 0$; $C_0^k := V$; $S_k := \emptyset$; $S_k^* := \emptyset$;
    while $\ell \ge 0$ do
        if $C_\ell^k \neq \emptyset$ then
            if $|S_0 \cap C_\ell^k| \neq 0$ then
                select a vertex $q \in \{S_0 \cap C_\ell^k\}$;
            else
                select a vertex $q \in C_\ell^k$;
            $C_\ell^k := C_\ell^k \setminus q$;
            $S_k := S_k \cup q$;
            $C_{\ell+1}^k := \{i \in C_\ell^k : G_k[i \cup S_k]$ satisfies $\Pi\}$;
            $\delta := |S_0 \setminus S_k| + |S_k \setminus S_0|$;
            $\gamma := |S_0 \cap C_{\ell+1}^k|$;
            if $\delta - \gamma \le M$ then
                if $\delta = M$ and $\gamma = 0$ then
                    if $|S_k| > |S_k^*|$ then
                        $S_k^* := S_k$;
                    $S_k := S_k \setminus q$;
                else
                    $q_\ell := q$;
                    $\ell := \ell + 1$;
                    if $\delta \le M$ and $|S_k| > |S_k^*|$ then
                        $S_k^* := S_k$;
            else
                $S_k := S_k \setminus q$;
        else
            $S_k := S_k \setminus q_{\ell-1}$;
            $\ell := \ell - 1$;
    return $Q_k(S_0) := |S_k^*|$;
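The second-stage search can also be written recursively. The following is a hedged sketch rather than a transcription of Algorithm 2: it assumes the property is completeness (cliques), represents the scenario graph as a dictionary of neighbor sets, and returns `None` when no subset within the budget exists (the case in which the text assigns the value $-\infty$). The function name and data layout are hypothetical.

```python
# Recursive sketch of the budget-driven second-stage search for cliques.
# delta = |S0 \ Sk| + |Sk \ S0|; gamma = |S0 ∩ candidates|.

def second_stage_clique(adj_k, s0, budget):
    s0 = frozenset(s0)
    best = None  # size of the largest feasible clique found so far

    def recurse(sk, cand):
        nonlocal best
        delta = len(s0 ^ sk)      # symmetric difference with S0
        gamma = len(s0 & cand)    # candidates that could still lower delta
        if delta - gamma > budget:
            return                # the budget cannot be met below this node
        if delta <= budget and (best is None or len(sk) > best):
            best = len(sk)
        remaining = set(cand)
        # Vertices of S0 are tried first because they do not consume budget.
        for q in sorted(cand, key=lambda v: (v not in s0, v)):
            remaining.discard(q)
            recurse(sk | {q}, remaining & adj_k[q])

    recurse(frozenset(), set(adj_k))
    return best

# Toy scenario graph: triangle on {0,1,2}, vertex 3 isolated.
ADJ_K = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: set()}
for m in (2, 1, 0):
    print(m, second_stage_clique(ADJ_K, {0, 1, 3}, m))
```

With $S_0 = \{0, 1, 3\}$, a budget of 2 allows swapping vertex 3 for vertex 2, a budget of 1 only allows dropping vertex 3, and a budget of 0 leaves no feasible clique.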

4  A mathematical programming formulation of the TSMS problem


A mathematical programming formulation of the maximum subgraph problem (1) can be obtained by, for example, defining a binary vector $\mathbf{x} \in \{0, 1\}^{|V|}$ that indicates whether vertex $i \in V$ belongs to the sought subset $S$ (i.e., $x_i = 1$ if $i \in S$ and $x_i = 0$ otherwise), and expressing the property $\Pi$ in the form of “structural” constraints $\Pi_G(\mathbf{x}) \le 0$, such that these constraints are satisfied for a given $\mathbf{x}$ if and only if $G[S]$ satisfies $\Pi$, where $S = \{i \in V : x_i = 1\}$:

$$ \max \bigl\{ \mathbf{1}^\top \mathbf{x} : \Pi_G(\mathbf{x}) \le 0,\ \mathbf{x} \in \{0, 1\}^{|V|} \bigr\}. \tag{14} $$

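Here $\mathbf{1}$ denotes the vector of ones of an appropriate dimension. The corresponding 0-1 integer programming formulation of TSMS problem (5) then takes the form

$$
\begin{aligned}
\max \quad & \mathbf{1}^\top \mathbf{x} + \sum_{k \in \mathcal{N}} p_k \mathbf{1}^\top \mathbf{y}_k && \text{(15a)} \\
\text{s.t.} \quad & \Pi_{G_0}(\mathbf{x}) \le 0 && \text{(15b)} \\
& \Pi_{G_k}(\mathbf{y}_k) \le 0, \quad \forall k \in \mathcal{N} && \text{(15c)} \\
& \|\mathbf{x} - \mathbf{y}_k\|_1 \le M, \quad \forall k \in \mathcal{N} && \text{(15d)} \\
& \mathbf{x}, \mathbf{y}_k \in \{0, 1\}^{|V|}, \quad \forall k \in \mathcal{N}, && \text{(15e)}
\end{aligned}
$$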


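where the vector $\mathbf{x}$ denotes the first-stage decision variables, and the second-stage variables $\mathbf{y}_k$ are defined for any fixed $k \in \mathcal{N}$ as $y_{ik} = 1$ if $i \in S_k$ and $y_{ik} = 0$ otherwise. Constraints (15d) impose the previously described budgetary restrictions. In correspondence to (6), the above extensive formulation of the TSMS problem can be equivalently presented in recourse form:

$$
\begin{aligned}
\max \quad & \mathbf{1}^\top \mathbf{x} + \sum_{k \in \mathcal{N}} p_k Q_k(\mathbf{x}) && \text{(16a)} \\
\text{s.t.} \quad & \Pi_{G_0}(\mathbf{x}) \le 0 && \text{(16b)} \\
& \mathbf{x} \in \{0, 1\}^{|V|}, && \text{(16c)}
\end{aligned}
$$

where the second-stage function is given by

$$
\begin{aligned}
Q_k(\mathbf{x}) = \max \quad & \mathbf{1}^\top \mathbf{y}_k && \text{(17a)} \\
\text{s.t.} \quad & \Pi_{G_k}(\mathbf{y}_k) \le 0 && \text{(17b)} \\
& \|\mathbf{x} - \mathbf{y}_k\|_1 \le M && \text{(17c)} \\
& \mathbf{y}_k \in \{0, 1\}^{|V|}. && \text{(17d)}
\end{aligned}
$$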


We next consider a particular instance of the TSMS problem in which the property $\Pi$ defines a clique.


5  Two-stage  stochastic  maximum  clique  problem 

As an illustrative example of the general TSMS problem and the proposed solution approaches, in this section we consider the two-stage maximum clique problem, a special case of the TSMS problem (5) when the property $\Pi$ represents completeness. Then, the graph-theoretical formulation of the two-stage stochastic maximum clique problem takes the form

$$
\begin{aligned}
\max \quad & |S_0| + \sum_{k \in \mathcal{N}} p_k |S_k| && \text{(18a)} \\
\text{s.t.} \quad & S_k \in \{ S \subseteq V : (i, j) \in E_k \ \ \forall\, i, j \in S \}, \quad \forall k \in \{0\} \cup \mathcal{N} && \text{(18b)} \\
& |S_0 \setminus S_k| + |S_k \setminus S_0| \le M, \quad \forall k \in \mathcal{N} && \text{(18c)} \\
& S_k \subseteq V, \quad \forall k \in \{0\} \cup \mathcal{N}. && \text{(18d)}
\end{aligned}
$$


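The corresponding mathematical programming formulation that we use in this work employs the well-known edge formulation [18] of the structural constraints that guarantee completeness of the selected subgraph, namely

$$ \{\mathbf{z} \in \{0, 1\}^{|V|} : \Pi_G(\mathbf{z}) \le 0\} = \{\mathbf{z} \in \{0, 1\}^{|V|} : z_i + z_j \le 1 \text{ for all } (i, j) \in \overline{E}\}, $$

where $\overline{E}$ represents the set of edges of the complement of graph $G$, i.e., $(i, j) \in \overline{E} \Leftrightarrow (i, j) \notin E$ for any $i, j \in V$. Then, the two-stage stochastic maximum clique problem admits the following 0-1 integer programming form:

$$
\begin{aligned}
\max \quad & \sum_{i \in V} x_i + \sum_{k \in \mathcal{N}} p_k \Bigl( \sum_{i \in V} y_{ik} \Bigr) && \text{(19a)} \\
\text{s.t.} \quad & x_i + x_j \le 1, \quad \forall (i, j) \in \overline{E}_0 && \text{(19b)} \\
& y_{ik} + y_{jk} \le 1, \quad \forall (i, j) \in \overline{E}_k,\ k \in \mathcal{N} && \text{(19c)} \\
& \sum_{i \in V} |x_i - y_{ik}| \le M, \quad \forall k \in \mathcal{N} && \text{(19d)} \\
& x_i, y_{ik} \in \{0, 1\}, \quad \forall i \in V,\ k \in \mathcal{N}. && \text{(19e)}
\end{aligned}
$$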

Formulation  (19)  can  be  solved  with  appropriate  integer  programming  solvers. 
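For very small instances, the two-stage clique problem can be checked by exhaustive enumeration, which is useful as a reference when validating either a BnB implementation or an integer programming model. The sketch below is such a brute-force check and makes no claim about how the reported experiments were run; the function names, the dictionary-of-sets graph representation, and the toy instance are assumptions.

```python
# Brute-force reference for the two-stage clique problem on tiny graphs.
from itertools import combinations

def is_clique(adj, vs):
    return all(b in adj[a] for a, b in combinations(vs, 2))

def all_cliques(adj):
    """Yield every clique (including the empty set) as a frozenset."""
    verts = sorted(adj)
    for r in range(len(verts) + 1):
        for vs in combinations(verts, r):
            if is_clique(adj, vs):
                yield frozenset(vs)

def brute_force_two_stage(adj0, scenarios, budget):
    """scenarios: list of (probability, adjacency). Returns (value, S0) or None."""
    scen_cliques = [(p, list(all_cliques(adj))) for p, adj in scenarios]
    best = None
    for s0 in all_cliques(adj0):
        value = float(len(s0))
        for p, cliques in scen_cliques:
            sizes = [len(sk) for sk in cliques if len(s0 ^ sk) <= budget]
            if not sizes:          # S0 cannot be repaired in this scenario
                value = None
                break
            value += p * max(sizes)
        if value is not None and (best is None or value > best[0]):
            best = (value, s0)
    return best

G0 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}   # triangle 0-1-2 plus edge 2-3
G2 = {0: set(), 1: {2, 3}, 2: {1, 3}, 3: {1, 2}}    # triangle 1-2-3, vertex 0 isolated
result = brute_force_two_stage(G0, [(0.5, G0), (0.5, G2)], budget=2)
print(result)
```

The enumeration is exponential in the number of vertices, so it is only meant for cross-checking on graphs with a handful of vertices.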

The property-specific techniques for finding cliques in all types of graphs via Algorithms 1-2 are described next.

5.1  Candidate  set  generation,  branching  and  bounding  techniques 

When property $\Pi$ defines a clique, a number of efficient techniques have been developed in the literature that can be utilized for candidate set generation, branching, and bounding. For example, the candidate sets can be efficiently generated and updated as an intersection of the neighborhoods of the clique elements. Constructing candidate set (7) is performed by pairwise testing any vertex $j \in S_0$ against a vertex $i \in C_\ell$, and removing the vertices from $C_\ell$ that are not adjacent to every vertex of subgraph $S_0$, i.e.,

$$ C_{\ell+1} := \{\, i \in C_\ell : (i, j) \in E_0,\ \forall j \in S_0 \,\}. $$
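Because every vertex already in the candidate set is adjacent to all earlier members of the partial clique, the update reduces to a single set intersection with the neighborhood of the newly added vertex. A minimal sketch, assuming graphs stored as dictionaries of neighbor sets (the function name and toy graph are illustrative):

```python
# Incremental clique candidate update: after q joins the partial clique,
# only candidates adjacent to q remain.

def clique_candidates(cand, adj0, q):
    return (cand - {q}) & adj0[q]

ADJ0 = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
c0 = set(ADJ0)                        # initial candidate set: all vertices
c1 = clique_candidates(c0, ADJ0, 0)   # after adding vertex 0
c2 = clique_candidates(c1, ADJ0, 2)   # after adding vertex 2
print(c1, c2)
```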

A  refinement  criterion  with  respect  to  the  second-stage  graph  scenarios  as  described  by  Corollary  1  is 
furnished  by  the  next  proposition. 

Proposition 4  Given a scenario $k \in \mathcal{N}$ and a vertex $i \in C_{\ell+1}$, let $\Gamma_k(i) := \{j \in S_0 : (i, j) \in E_k\}$ represent the (sub)set of vertices such that any two vertices $i, j$ are adjacent in $G_k[S_0 \cup i]$. If the following inequality holds,

$$ |S_0| - |\Gamma_k(i)| > M, \tag{20} $$

then vertex $i$ can be removed from $C_{\ell+1}$.

Proof:  If $i$ is added to $S_0$ in the first stage, then it is easy to see that the vertices $S_0 \setminus \Gamma_k(i)$ must be removed from $S_0$ in order for $G_k[\Gamma_k(i) \cup i]$ to (possibly) form a complete graph in scenario $k \in \mathcal{N}$. Note that if subgraph $G_k[\Gamma_k(i)]$ is not a clique in $k \in \mathcal{N}$, then at least one vertex from the set $S_0 \setminus \Gamma_k(i)$ must be further removed from $S_0$ in the first stage. Thus, $|\Gamma_k(i) \cup i|$ provides an upper bound on the size of the maximum clique contained in $G_k[S_0 \cup i]$. Consequently, expression (20) approximates the minimum number of vertices that must be removed from $S_0$ in the second stage if vertex $i$ is included, which cannot exceed the budget $M$.  □
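The test of Proposition 4 only needs the scenario neighborhood of the candidate vertex inside $S_0$. A small sketch under the same dictionary-of-sets assumption (the function name and the two toy scenarios are hypothetical):

```python
# Proposition 4 as a filter: candidate i is dropped if, in at least one
# scenario k, |S0| - |{j in S0 : (i, j) in E_k}| exceeds the budget M.

def removable_by_prop4(i, s0, scenario_adjs, budget):
    return any(len(s0) - len(s0 & adj[i]) > budget for adj in scenario_adjs)

S0 = {0, 1, 2}
# Scenario A: vertex 3 is adjacent to all of S0.
ADJ_A = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
# Scenario B: vertex 3 has lost all of its edges.
ADJ_B = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: set()}

print(removable_by_prop4(3, S0, [ADJ_A], budget=2))         # kept
print(removable_by_prop4(3, S0, [ADJ_A, ADJ_B], budget=2))  # dropped
```

The check costs one set intersection per scenario, which is far cheaper than computing a coloring-type bound for every candidate.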

In this study, we consider two techniques for computing the upper bound $\upsilon_\Pi(\cdot)$ on the size of the maximum clique and for selecting a branching vertex $q \in C_\ell$ when property $\Pi$ represents a clique. We emphasize that proper selection of branching and bounding mechanisms according to a graph's structural characteristics and the sought property $\Pi$ heavily influences the computational performance of the solution method described in Algorithm 1.

5.1.1  An  approximate  coloring  algorithm 

The first technique utilizes principles introduced by Tomita et al. [23] to estimate the size of the maximum clique contained in $G[S]$, $S \subseteq V$, by partitioning $S$ into independent sets, also known as numbering or coloring classes. The vertices in $S$ are first sorted in descending order of degree, and a minimum positive integer $n_i$ is assigned to each vertex $i \in S$ such that $n_i \neq n_j$ if the pair $i, j \in S$ is connected by an edge $(i, j) \in E(G)$. Consequently, vertices associated with a number class $n_k$ (i.e., vertices with the same assigned integer value) form an independent set.

Since the size of any clique contained in $G[S]$ cannot exceed the number of coloring classes generated from $S$, one immediately obtains a bound on the maximum clique size as

$$ \upsilon_\Pi(G[S]) = \max\{n_i : i \in S\}. $$

We use this expression in Algorithm 1 to obtain the bounds $\upsilon_\Pi(G_k[S_0])$ and $\upsilon_\Pi(G_k[S_0 \cup C_{\ell+1}])$, $k \in \{0\} \cup \mathcal{N}$. Condition (13) then takes the form:

$$ |S_0| + \max\{n_i : i \in C_{\ell+1}\} + \sum_{k \in \mathcal{N}} p_k \min\bigl\{ \max\{n_i : i \in S_0 \cup C_{\ell+1}\}_k + M_k,\ |V| \bigr\} \le Z^*. \tag{21} $$

The branching rule used in connection with the described approximate coloring scheme is as follows: select a vertex $q \in C_\ell$ with the maximum number $n_q := \max\{n_i : i \in C_\ell\}$. Note that an initial coloring of set $C_0 := V$ is performed prior to Step 2.
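The numbering bound can be illustrated with a simplified greedy sequential coloring. This sketch follows the description in the text (degree-descending order, smallest admissible positive integer) and is not claimed to reproduce the exact procedure of the cited work; the function name and test graphs are assumptions.

```python
# Greedy numbering: each vertex receives the smallest positive integer not
# used by its already-numbered neighbors; the largest number bounds the
# clique size of the induced subgraph from above.

def coloring_bound(adj, vertices):
    vs = set(vertices)
    order = sorted(vs, key=lambda v: (-len(adj[v] & vs), v))  # degree descending
    number = {}
    for v in order:
        used = {number[u] for u in adj[v] & vs if u in number}
        n = 1
        while n in used:
            n += 1
        number[v] = n
    return max(number.values(), default=0), number

K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
C5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}  # 5-cycle

print(coloring_bound(K4, K4)[0])   # tight on a complete graph
print(coloring_bound(C5, C5)[0])   # not tight: the largest clique in C5 has size 2
```

The 5-cycle shows that the bound may overestimate the clique number, which is expected of any coloring-based bound on an odd cycle.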

5.1.2  Directed  acyclical  path  decomposition 

Yamaguchi and Masuda [29] proposed a clever technique for finding maximum weighted cliques in graphs by transforming $G[S]$, $S \subseteq V$, into a directed acyclical graph $\vec{G}[S]$ such that the lengths of the resulting acyclical paths represent bounds on the size of the maximum clique in $G[S]$. The method proceeds as follows. Without loss of generality, let each vertex $i \in S$ be associated with a unit weight $w_i = 1$, and define the set $U(S) := \{u_i : \forall i \in S\}$, where each element $u_i$ is initially equal to $w_i$. Then, the set $U(S)$ is updated by sequentially “propagating” the elements $u_i$, $\forall i \in S$, onto adjacent members in $S$. Particularly, during each iteration a vertex $i$ that corresponds to the minimum element $u_i$ in set $U(S)$ is selected, and $u_i$ is propagated by adding it to the weights of vertices $j \in S$ adjacent to vertex $i$ in graph $G[S]$. The adjacent elements $u_j$ are updated as

$$
u_j := \begin{cases} u_i + w_j, & \text{if } u_j < u_i + w_j, \\ u_j, & \text{otherwise,} \end{cases}
\qquad \text{for all } j \in \{j : (i, j) \in E,\ i, j \in S\}. \tag{22}
$$


Once a vertex $i \in S$ has been processed, $u_i$ is fixed and cannot be increased in subsequent propagations from other (unprocessed) adjacent vertices in $S$. The updating process terminates once all the elements in $U(S)$ have been fixed.

Observe that sequentially fixing elements $u_i$ produces a directed acyclical graph $\vec{G}[S]$, where, once all the elements in $U(S)$ are fixed, any $u_i \in U(S)$ represents the longest acyclical path in $\vec{G}[S]$ whose endpoint is the vertex $i \in S$ (see [29] for details). Utilizing the fact that the length$^3$ of the longest path in $\vec{G}[S]$ is an upper bound on the maximum clique size in $G[S]$, one obtains the bounding condition $\upsilon_\Pi(G[S]) = \max\{u_i \in U(S)\}$. Expression (13) then takes the form:

$$ |S_0| + \max\{u_i \in U(C_{\ell+1})\} + \sum_{k \in \mathcal{N}} p_k \min\bigl\{ \max\{u_i \in U(S_0 \cup C_{\ell+1})\}_k + M_k,\ |V| \bigr\} \le Z^*. \tag{23} $$


In this case, it is assumed that the vertex with the largest propagated weight from adjacent vertices has a high probability of being a part of the maximum clique. As a result, the algorithm branches by selecting the vertex $q \in C_\ell$ that corresponds to the maximum element in $U(C_\ell)$.
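The propagation rule (22) with unit weights can be sketched as below. The implementation is a simplified reading of the description above, with ties broken by vertex label (an assumption), and is not a reproduction of the cited algorithm; the function name and test graphs are illustrative.

```python
# Weight propagation with unit weights: repeatedly fix the unprocessed vertex
# with the smallest u-value and push u_i + 1 onto its unprocessed neighbors.
# The largest fixed value bounds the clique size from above, since the
# vertices of any clique are fixed one after another with increasing values.

def propagation_bound(adj, vertices):
    vs = set(vertices)
    u = {v: 1 for v in vs}      # u_i starts at the unit weight w_i = 1
    unfixed = set(vs)
    while unfixed:
        i = min(unfixed, key=lambda v: (u[v], v))
        unfixed.remove(i)       # u_i is now fixed
        for j in adj[i] & unfixed:
            if u[j] < u[i] + 1:
                u[j] = u[i] + 1
    return max(u.values(), default=0), u

K4 = {v: {w for w in range(4) if w != v} for v in range(4)}
P3 = {0: {1}, 1: {0, 2}, 2: {1}}                              # path 0-1-2
C5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}  # 5-cycle

for g in (K4, P3, C5):
    print(propagation_bound(g, g)[0])
```

A natural branching choice under this scheme is the candidate with the largest propagated value, as described in the text.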


5.2  Numerical  experiments  and  results 

Numerical experiments demonstrating the performance of the proposed BnB algorithms for solving the TSMS problem when property $\Pi$ represents a clique were conducted. Problem (19) was solved for randomly generated Erdős-Rényi graphs of orders $|V| = 25, 50, 75, 100$ with average densities of $d = 0.2, 0.5, 0.8$. The number $N$ of second-stage graph scenarios was selected as $N = 25, 50, 75$. For any given graph configuration, the number of vertices $|V|$ and densities $d$ remained fixed during both decision stages. The value of constant $M$ in the budget constraints was fixed at $M = \lceil \varepsilon |V| \rceil$, $\varepsilon = 0.15$ throughout.
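A test instance of this kind could be generated along the following lines. The text does not state how the scenario graphs or their probabilities were produced, so the sketch assumes independently sampled scenario graphs of the same order and density with equal probabilities, and reads the budget as the ceiling of 0.15 times the number of vertices; all names are hypothetical.

```python
# Sketch of a random instance generator (assumptions: independent G(n, d)
# scenario graphs, equiprobable scenarios, budget M = ceil(eps * n)).
import math
import random

def erdos_renyi(n, density, rng):
    """Return an undirected G(n, density) graph as a dict of neighbor sets."""
    adj = {v: set() for v in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < density:
                adj[a].add(b)
                adj[b].add(a)
    return adj

def make_instance(n, density, n_scenarios, eps=0.15, seed=0):
    rng = random.Random(seed)
    g0 = erdos_renyi(n, density, rng)
    scenarios = [(1.0 / n_scenarios, erdos_renyi(n, density, rng))
                 for _ in range(n_scenarios)]
    return g0, scenarios, math.ceil(eps * n)

g0, scenarios, budget = make_instance(n=25, density=0.5, n_scenarios=3)
print(len(g0), len(scenarios), budget)
```

Fixing the seed makes an instance reproducible, which helps when comparing two solution methods on identical inputs.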

The combinatorial first- and second-stage BnB algorithms described in Section 3 were coded in C++, and the CPLEX 12.5 integer programming (IP) solver was used for solving the mathematical programming formulation (19) of the two-stage stochastic maximum clique problem. The computations were run on an Intel Xeon 3.30GHz PC with 128GB of RAM in a 64-bit Windows 7 environment.

The combinatorial BnB method defined by Algorithms 1 and 2 was implemented in two versions, which use the branching and bounding techniques described in Sections 5.1.1 and 5.1.2, and which are henceforth referred to as “BnB 5.1.1” and “BnB 5.1.2”, respectively. The computational performance of both variants of Algorithms 1-2 was compared with that of the mathematical programming formulation (19) as solved by the CPLEX solver. The results are reported in Tables 1, 2, and 3, where columns with headings “CPLEX”, “BnB 5.1.1”, and “BnB 5.1.2” contain the results obtained using the respective methods. Ten instances of each problem/graph configuration were generated and the corresponding solution times and objective values were averaged accordingly. A maximum solution time limit of 3600 seconds was imposed, and the symbol “-” is used to indicate that the time limit was exceeded for all ten instances of the given graph configuration. If only a portion of the instances were solved within the time limit, the number of instances that achieved a solution and their corresponding average solution times are presented.

$^3$ The path length is given by the aggregate weight of the vertices that it coincides with.

Table 1 summarizes the computational times for graphs with average edge densities of $d = 0.2$. Observe that both BnB algorithms provide an improvement in running time of at least three orders of magnitude on all problem configurations in comparison to the CPLEX IP solver, and the BnB variant based on acyclical path decomposition produces the best results. It must be noted, however, that sparse graphs put the mathematical programming formulation (19) of the two-stage stochastic maximum clique problem at a disadvantage, since the employed “edge formulation” of clique constraints is based on the complement of the graph, which results in a large number of constraints (19b)-(19c) when the underlying graph is sparse. At the same time, the proposed general combinatorial BnB algorithm performs better when the depth, i.e., the number of “levels” of the BnB tree, is smaller, which is observed on sparse graphs.

Thus, a fairer comparison of the combinatorial and mathematical programming-based schemes can be accomplished when one considers graphs with densities close to $d = 0.5$; see Table 2. It can still be observed, though, that the combinatorial BnB methods drastically outperform the mathematical programming formulation, where the difference is especially evident on instances that could be solved by all three methods, and the branching and bounding rules based on acyclical path decomposition are still superior. At the same time, graphs of density $d = 0.5$ present a greater challenge to the proposed BnB method, as both its variants were unable to solve larger problems to optimality within the allowed time limit. Note that in the cases when all three methods failed to find an optimal solution within 1 hour, the BnB methods report partial solutions with higher objective value.

Computational results for the two-stage stochastic maximum clique problem on graphs with average
densities of d = 0.8 are presented in Table 3. At these densities, the combinatorial BnB methods are generally
inferior to the mathematical programming formulation (19), which can be explained by the fact that the
number of clique constraints (19b)-(19c) is relatively small for dense graphs, making problem (19) easier
to solve, while the depth of the BnB tree increases with the density of the graph, which leads to
deteriorated BnB solution times. On the other hand, it can be seen that the combinatorial methods are still
preferable when |V| = 25, suggesting that the proposed algorithms may be beneficial for dense graphs
when the number of scenarios is large relative to the number of vertices.

In the case when d = 0.8 and |V| = 100, both BnB algorithms failed to generate superior objective
values in comparison to CPLEX, an obvious deviation from the trend of the preceding results for instances
of the same density and |V| = 50, 75. Empirical evidence suggests that the majority of computational
time for these instances was spent on solving the second-stage problems, while very few first-stage
solutions were processed. This observation indicates that using a second-stage branching and bounding
criterion based solely on budgetary restrictions may not be effective for very dense underlying graphs,
particularly once a certain number of vertices is exceeded. Although the present study aims to define a
budgetary-based second-stage BnB approach, it is likely that supplemental graph-structural techniques
analogous to those presented in Sections 5.1.1 - 5.1.2 would produce superior results; a task that we
reserve for future research.

              CPLEX                     BnB 5.1.1                 BnB 5.1.2
 |V|    N     #    Time (s)     Obj     #    Time (s)     Obj     #    Time (s)     Obj
  25   25    10       24.19    6.29    10        0.02    6.29    10        0.01    6.29
  25   50    10      286.08    6.16    10        0.04    6.16    10        0.01    6.16
  25   75     6      308.47    6.14    10        0.04    6.17    10        0.02    6.17
  50   25    10      279.91    8.17    10        0.27    8.17    10        0.17    8.17
  50   50     5     1066.66    8.09    10        0.60    8.11    10        0.34    8.11
  50   75     0           —    8.07    10        0.90    8.09    10        0.55    8.09
  75   25     0           —    9.46    10        1.73    9.47    10        1.22    9.47
  75   50     0           —    9.16    10        3.60    9.20    10        2.40    9.20
  75   75     0           —    9.34    10        5.51    9.49    10        4.93    9.49
 100   25     0           —    9.96    10        7.90   10.04    10        7.30   10.04
 100   50     0           —    9.88    10       16.09   10.02    10       15.26   10.02
 100   75     0           —    9.32    10       23.09   10.05    10       21.21   10.05

Table 1: Average solution times (in seconds) and objective values for problem (19) on random graphs with an edge density of
0.2 and M = ⌈0.15|V|⌉. All running times are averaged over 10 instances, and the symbol “—” indicates that the time limit of
3600 seconds was exceeded. Columns labeled “#” give the number of instances solved within the time
limit.
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9.75 

10 
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10 
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75 

7 
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10 
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10 
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50 

25 
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— 

13.39 

10 
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10 

27.31 

13.43 

50 

0 

— 

14.07 

10 

74.53 

14.11 

10 

44.07 

14.11 

75 

0 

— 

13.46 

10 
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13.72 

10 

64.64 
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75 

25 

0 

— 

15.84 

10 

2631.96 
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10 
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50 

0 

— 

15.74 

0 

— 

16.28 
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— 
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75 
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— 
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0 

— 

16.14 

0 

— 
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25 

0 

— 

17.13 

0 

— 
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0 

— 
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50 

0 

— 

16.61 

0 

— 

17.07 

0 

— 

17.65 

75 

0 

— 

15.92 

0 

— 

17.04 

0 

— 

17.86 

Table 2: Average solution times (in seconds) and objective values for problem (19) on random graphs with an edge density of
0.5 and M = ⌈0.15|V|⌉. All running times are averaged over 10 instances, and the symbol “—” indicates that the time limit of
3600 seconds was exceeded. Columns labeled “#” give the number of instances solved within the time
limit.


              CPLEX                     BnB 5.1.1                 BnB 5.1.2
 |V|    N     #    Time (s)     Obj     #    Time (s)     Obj     #    Time (s)     Obj
  25   25   ...         ...     ...   ...        3.35   15.87    10        1.38   15.87
  25   50    10       14.40   15.79    10       11.44   15.79    10        5.43   15.79
  25   75    10       51.61   14.96    10       12.57   14.96    10        7.22   14.96
  50   25    10     1078.03   23.97     0           —   23.25     0           —   23.69
  50   50     6     3125.09   23.43     0           —   23.05     0           —   23.33
  50   75     0           —   22.50     0           —   22.19     0           —   22.85
  75   25     0           —   28.07     0           —   26.57     0           —   28.29
  75   50     0           —   27.57     0           —   27.14     0           —   28.31
  75   75     0           —   27.19     0           —   26.24     0           —   27.50
 100   25     0           —   30.94     0           —   18.49     0           —   18.44
 100   50     0           —   30.54     0           —   18.41     0           —   18.39
 100   75     0           —   30.14     0           —   18.35     0           —   18.27

Table 3: Average solution times (in seconds) and objective values for problem (19) on random graphs with an edge density of
0.8 and M = ⌈0.15|V|⌉. All running times are averaged over 10 instances, and the symbol “—” indicates that the time limit of
3600 seconds was exceeded. Columns labeled “#” give the number of instances solved within the time
limit.


6 Conclusions

We have introduced a new class of two-stage stochastic maximum subgraph problems for finding the
maximum expected size of a graph that satisfies a defined structural property Π. Emphasis was put on
identifying subgraphs whose properties can be restored within a limited repair budget in the presence
of structural uncertainties that manifest in the form of random connection (edge) changes/failures. A
combinatorial BnB algorithm exploiting the structure of two-stage stochastic maximum Π subgraph
problems was developed. Our technique utilizes two combinatorial BnB algorithms for finding optimal
first- and second-stage subgraph solutions.

The proposed framework applies to a broad range of graph properties, and in this work we illustrated
the proposed approach on an example when the property of interest Π defines a clique. Numerical
simulations on randomly generated graphs indicate that solution times can be reduced by several orders
of magnitude via the proposed BnB algorithm in comparison to an equivalent mathematical programming
solver. Namely, for all the tested graph configurations other than the ones with a high edge density of d = 0.8,
performance improvements of one or more orders of magnitude were observed.

Abstract

In this work, we study the problem of detecting risk-averse low-diameter clusters in graphs. It
is assumed that the clusters represent k-clubs and that uncertain information manifests itself in the
form of stochastic vertex weights whose joint distribution is known. The goal is to find a k-club
of minimum risk contained in the graph. A stochastic programming framework that is based on the
formalism of coherent risk measures is used to quantify the risk of a cluster. We show that the selected
representation of risk guarantees that the optimal subgraphs are maximal clusters. A combinatorial
branch-and-bound algorithm is proposed and its computational performance is compared with an
equivalent mathematical programming approach for instances with k = 2, 3, and 4.

Keywords: k-club; low-diameter clusters; stochastic graphs; coherent risk measures; combinatorial
branch-and-bound

1 Introduction


Graphs are effective tools for modeling many real-world systems and the complex interactions between
their components. A typical graph model assigns vertices to represent a system's components and a
set of edges to describe the connections and/or relationships between them. Well-known examples of
such frameworks are represented by many systems studied in social network analysis, transportation,
telecommunications, computational finance, and so on. Additionally, graph-based data mining
methods [18] provide powerful techniques for analyzing and understanding systems whose descriptive data
may be represented using a graph.

A principal application of graph-based data mining involves the identification of subgraphs, referred
to as clusters, corresponding to subsystems with a given structural or functional property. For example,
in social networks, detecting highly-connected clusters can be used for advertising and marketing
purposes [22, 23, 46]; in stock market graphs, it can be used for identifying diverse portfolios [12]; and in
call graphs, it can be used for detecting communicating clusters [1].

One  of  the  basic  problems  in  this  context  entails  finding  the  largest  “perfectly”  cohesive  group  within 
a  network  such  that  the  confined  members  are  all  interconnected,  also  known  as  the  maximum  clique 
(complete  subgraph)  problem.  Several  prominent  studies  provided  the  basis  for  exact  combinatorial 
solution  algorithms  for  the  maximum  clique  problem  [8,  16,  33].  In  particular,  Carraghan  and  Pardalos 
[16]  introduced  a  recursive  branch-and-bound  method  for  finding  maximum  cliques  by  exploiting  the 
heredity  property  [42]  of  complete  subgraphs.  Subsequent  extensions  of  their  work  enhanced  the  process 
of  reducing  solution  space  via  vertex  coloring  schemes  for  estimation  of  upper  bounds  on  the  maximum 
achievable  subgraph  sizes  during  the  branch-and-bound  procedure  (e.g.  [15,  24,  41]). 

In many practical applications, the requirement that the desired subgraph must be complete may,
however, impose excessive restrictions and therefore warrant some structural relaxation in terms of
cluster connectivity. As a consequence, a number of clique relaxation models have been proposed in the graph
theory literature, which relax the completeness property relative to the degree of the member vertices,
their distance from each other, or the density of the subgraph. A comprehensive review of clique
relaxation models is provided in [10]. In the present work, we focus on a specific type of clique relaxation,
known as the k-club [3], which represents a subgraph whose members are connected via at most k - 1
intermediary members. The k-club model effectively represents low-diameter clusters that may reveal
valuable information embedded in social, financial, and telecommunication networks. Several recent
studies proposed combinatorial branch-and-bound methods and presented complexity results associated
with finding maximum k-clubs in graphs [13, 17, 34, 39].

An important extension of the described class of problems involves the imposition of topologically
exogenous information in the form of deterministic vertex weights, and correspondingly finding a subset
of maximum weight that conforms to a defined structural property. Similar exact weight-based
branch-and-bound solution techniques have been developed for determining the maximum-weight subgraphs [7,
28, 32].

Numerous circumstances may further justify the imposition of uncertain exogenous information over
the graph's edges that influences network flow distribution, robustness, and costs [4, 6, 14, 20, 21, 44].
However, far fewer endeavors consider decision making relative to the optimal allocation of resources
over defined subgraph topologies when uncertainties are induced by stochastic factors associated with
network vertices [38]. For example, in social networks or call graphs, the uncertainties related to the
value or reliability of the information provided by each entity can be modeled by random weights on
vertices whose relationships or communications are represented by edges. Similarly, in stock market graphs,
the uncertainties associated with returns on investments from different assets can be defined as random
weights assigned to their corresponding vertices, with edges linking highly correlated assets (vertices).

In this study, we extend the techniques introduced in [38] to address problems of finding subgraphs of
minimum risk that represent a k-club. A probabilistic framework utilizing the distributional information
of stochastic vertex weights by means of coherent measures of risk [5, 19] is employed to define a
risk-averse k-club (RA-k) problem as finding the lowest risk k-club in a network. As an illustrative
example, we focus on instances when k = 2, 3, 4, and utilize a mathematical programming formulation
introduced in [43] for finding a maximum k-club in a graph. A combinatorial branch-and-bound method
for finding a largest k-club [13, 17, 34] is also modified to accommodate the conditions of the RA-k problem
via risk-based branching and bounding schemes. We compare the solution performance of the proposed
branch-and-bound algorithm relative to solving the mathematical programming formulation for the RA-k
problem using a state-of-the-art commercial solver.

The remainder of the paper is organized as follows. In Section 2, we examine the general
representation of the RA-k problem and discuss its properties. Section 3 presents a mathematical programming
formulation and a combinatorial branch-and-bound method for solving the RA-k problem. Finally, Section 4
furnishes numerical studies demonstrating the computational performances of the developed
branch-and-bound method and the aforementioned mathematical programming approach on problems where risk is
quantified using higher-moment coherent risk measures [27].

2 Risk-averse k-club problem

Given an undirected graph $G = (V, E)$ and any subset of its vertices $S \subseteq V$, let $G[S]$ represent the
subgraph of $G$ induced by $S$, i.e., $G[S] = (S, E \cap (S \times S))$. Let Q denote the desired property
which the induced graph $G[S]$ must satisfy. The present work considers the case when Q represents a
certain relaxation of the completeness property, such that a subgraph with property Q represents a clique
relaxation.

Depending on the characteristic of a complete graph that is relaxed, clique relaxations can be
categorized into density-based [1, 2, 35], degree-based [40], and distance-based [3, 29, 30] relaxations. In
this work, property Q represents a special distance-based relaxation of the completeness property. For a
formal definition, let $d_G(i, j)$ denote the distance between nodes $i, j \in V$ in graph $G$, measured as the
number of edges in a shortest path between $i$ and $j$ in $G$. Then, a subset of vertices $S \subseteq V$ of graph $G$
is called a k-clique if

$$\max_{i, j \in S} d_G(i, j) \le k.$$

Note that the definition of the k-clique does not require that the shortest path between $i, j \in S$ belong
to $G[S]$. If one requires that the shortest path between any two vertices $i, j$ in $S$ belong to the induced
subgraph $G[S]$, then the subset $S$ such that

$$\max_{i, j \in S} d_{G[S]}(i, j) \le k \tag{1}$$

is called a k-club. Note that a k-club is also a k-clique, while the converse is not true in general. By
definition, 1-cliques and 1-clubs are cliques. Throughout the remainder of this study, we let $\Gamma_G(k)$
denote the set of all k-clubs in graph $G$:

$$\Gamma_G(k) = \{ S \subseteq V : d_{G[S]}(i, j) \le k, \ \forall i, j \in S \}. \tag{2}$$
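The difference between a k-clique and a k-club can be checked mechanically. The following is a minimal sketch, assuming the graph is stored as a dictionary that maps each vertex to the set of its neighbors. The function names are hypothetical and are not part of the algorithms discussed in this paper. The only difference between the two tests is the set of vertices through which shortest paths are allowed to pass.

```python
from collections import deque


def hop_distances(adj, source, allowed):
    """Breadth-first search from `source` using only vertices in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in allowed and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def max_pairwise_distance(adj, subset, allowed):
    """Largest distance between members of `subset`; inf if some pair is disconnected."""
    worst = 0
    for s in subset:
        dist = hop_distances(adj, s, allowed)
        for t in subset:
            if t not in dist:
                return float("inf")
            worst = max(worst, dist[t])
    return worst


def is_k_clique(adj, subset, k):
    """Distances are measured in the whole graph G."""
    return max_pairwise_distance(adj, set(subset), set(adj)) <= k


def is_k_club(adj, subset, k):
    """Distances are measured in the induced subgraph G[S], as in (1)."""
    members = set(subset)
    return max_pairwise_distance(adj, members, members) <= k


# Illustrative graph: a cycle on five vertices.
cycle = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
subset = {1, 2, 3, 4}
print(is_k_clique(cycle, subset, 2), is_k_club(cycle, subset, 2))
```

On the five-vertex cycle, the subset {1, 2, 3, 4} is a 2-clique because vertices 1 and 4 are joined through vertex 5, but it is not a 2-club because the induced subgraph is a path in which vertices 1 and 4 are three edges apart.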

Additionally, a k-club is said to be maximal if it is not strictly contained in another k-club, and a
maximum k-club is a k-club of the largest order in graph $G$.

A popular class of graph-theoretical problems is represented by the maximum weight subgraph
problems, which are concerned with finding a subset $S$ of vertices in $G$ such that the induced subgraph
satisfies the given property Q and has the largest weight (defined as the sum of its vertices' weights).
The maximum weight k-club problem is then formulated as

$$\max \Big\{ \sum_{i \in S} w_i : S \in \Gamma_G(k) \Big\}, \tag{3}$$

where $w_i > 0$ represents the weight of vertex $i$ and the set $\Gamma_G(k)$ is defined by (2). Clearly, an optimal
set $S$ in problem (3) will be a maximal, but not necessarily maximum (of the largest order), set with property
Q. If the weight of each vertex is one, the maximum weight k-club problem is simply referred to as
the maximum k-club problem.

In this work, we consider an extension of problem (3) that assumes stochastic vertex weights. In this
case, a direct translation into a stochastic framework is not straightforward, because the
maximization of random weights is ill-posed in the context of stochastic programming: it does not
admit a deterministic optimal solution. Likewise, maximization of the expected weight of the sought
set is rather uninteresting in the sense that it reduces to the deterministic version of the problem presented
above. A more suitable approach, thus, involves computing the subgraph's weight via a (nonlinear)
statistical function that utilizes the distributional information about the weights' uncertainties, rather than a
simple sum of its vertices' stochastic weights. In particular, we pursue a risk-averse approach so as to
find the subgraph of $G$ that has the lowest risk and satisfies property Q. Let $X_i$ denote a random variable
that represents the costs or losses associated with vertex $i \in V$, such that the joint distribution of vector
$X_G = (X_1, \ldots, X_{|V|})$ is known. Then, the problem of finding the minimum risk subgraph in $G$ that has
property Q, or the risk-averse Q problem, takes the form

$$\min \{ \mathcal{R}(S; X_G) : S \subseteq V \text{ and } G[S] \text{ satisfies } Q \}, \tag{4}$$

where $\mathcal{R}(S; X_G)$ is the risk associated with set $S$ given the distributional information $X_G$. In the
particular case when property Q ensures that the subgraph in question is a k-club, formulation (4) defines the
risk-averse k-club problem (RA-k),

$$\min \{ \mathcal{R}(S; X_G) : S \in \Gamma_G(k) \}, \tag{5}$$

which represents a risk-averse stochastic generalization of the deterministic maximum weight k-club
problem (3), as shown below.

A constructive form of the risk function $\mathcal{R}(S; X_G)$ can be introduced by employing the concept of a
risk measure, which is well known in the stochastic optimization literature [26]. Given a probability space $(\Omega, \mathcal{F}, \mathsf{P})$,
where $\Omega$ is the set of random events, $\mathcal{F}$ is the $\sigma$-algebra, and $\mathsf{P}$ is a probability measure, a risk measure
$\rho$ is defined as a mapping $\rho : \mathcal{X} \mapsto \mathbb{R}$, where $\mathcal{X}$ is a linear space of $\mathcal{F}$-measurable functions
$X : \Omega \mapsto \mathbb{R}$. In what follows, the space $\mathcal{X}$ is assumed to possess the properties necessary for the risk
measures introduced below to be well-defined. Namely, $\mathcal{X}$ is supposed to allow for a sufficient degree of
integrability, in particular, $\mathsf{E}|X| < \infty$, and be endowed with an appropriate topology, e.g., the topology
induced by convergence in probability. Lastly, we consider risk measures that are proper functions on
$\mathcal{X}$, i.e., $\rho(X) > -\infty$ for all $X \in \mathcal{X}$ and $\{X \in \mathcal{X} : \rho(X) < \infty\} \ne \emptyset$.

Then, assuming that risk measure $\rho$ is lower semi-continuous (l.s.c.), the risk $\mathcal{R}(S; X_G)$ of a set
$S \subseteq V$ with uncertain vertex weights $X_i$, $i \in V$, can be defined as the optimal value of the following
stochastic programming problem:

$$\mathcal{R}(S; X_G) = \min \Big\{ \rho\Big( \sum_{i \in S} u_i X_i \Big) : \sum_{i \in S} u_i = 1; \ u_i \ge 0, \ i \in S \Big\}. \tag{6}$$

Note that this definition of the set risk function $\mathcal{R}(\cdot)$ admits risk reduction through diversification, as
illustrated by the following proposition:

Proposition 1 ([38]) Given a graph $G = (V, E)$ with stochastic weights $X_i$, $i \in V$, and a l.s.c. risk
measure $\rho$, the set risk function defined by (6) satisfies

$$\mathcal{R}(S_2; X_G) \le \mathcal{R}(S_1; X_G) \quad \text{for all } S_1 \subseteq S_2. \tag{7}$$
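Property (7) can be observed numerically in a special case where problem (6) has a closed-form solution. The sketch below makes several simplifying assumptions that are not part of the general framework: the vertex weights are independent with a common mean m and variances v_i > 0, and the risk measure is rho(X) = E[X] + Var(X). Under these assumptions the objective of (6) equals m plus the sum of u_i^2 v_i, which is minimized at u_i proportional to 1/v_i, giving the optimal value m + 1 / (sum of 1/v_i). The function name is illustrative.

```python
def set_risk_mean_variance(mean, variances):
    """Optimal value of (6) for rho(X) = E[X] + Var(X).

    Assumptions (illustrative only): independent vertex weights that share
    the same mean and have strictly positive variances. The minimizer puts
    weight u_i proportional to 1 / variances[i] on vertex i.
    """
    if not variances or any(v <= 0 for v in variances):
        raise ValueError("need at least one strictly positive variance")
    return mean + 1.0 / sum(1.0 / v for v in variances)


# Nested sets S_1 in S_2 in S_3 in S_4, built by adding one vertex at a time.
variances = [0.25, 0.5, 1.0, 2.0]
risks = [set_risk_mean_variance(0.5, variances[:n]) for n in range(1, 5)]
print(risks)
```

Each added vertex contributes a positive term to the sum of reciprocal variances, so the computed risk is nonincreasing along the nested sets, in agreement with (7).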

The following observation regarding the optimal solution of the risk-averse Q problem (4) stems
directly from property (7):

Corollary 1 There exists an optimal solution of the risk-averse Q problem (4) with $\mathcal{R}(S; X_G)$ defined
by (6) that is a maximal set with property Q in $G$.

Additional properties of $\mathcal{R}(S; X_G)$ as defined by (6) ensue from the assumption that the risk measure
$\rho$ belongs to the family of coherent measures of risk [5], i.e., satisfies the properties of monotonicity,
$\rho(X) \le 0$ for all $X \le 0$; subadditivity, $\rho(X + Y) \le \rho(X) + \rho(Y)$; transitional invariance,
$\rho(X + c) = \rho(X) + c$ for all $c \in \mathbb{R}$; and positive homogeneity, $\rho(\lambda X) = \lambda \rho(X)$ for all $\lambda > 0$. Then, the
corresponding set risk function $\mathcal{R}(S; X_G)$ satisfies analogous properties with respect to the stochastic
weights vector $X_G$:

(G1) monotonicity: $\mathcal{R}(S; X_G) \le \mathcal{R}(S; Y_G)$ for all $X_G \le Y_G$;

(G2) positive homogeneity: $\mathcal{R}(S; \lambda X_G) = \lambda \mathcal{R}(S; X_G)$ for all $X_G$ and $\lambda > 0$;

(G3) transitional invariance: $\mathcal{R}(S; X_G + a\mathbf{1}) = \mathcal{R}(S; X_G) + a$ for all $a \in \mathbb{R}$;

where $\mathbf{1}$ is the vector of ones, and the vector inequality $X_G \le Y_G$ is interpreted component-wise.
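The four coherence axioms listed above can be spot-checked on a small scenario set. The sketch below uses the mean-upper-semideviation measure rho(X) = E[X] + kappa E[(X - E[X])_+], which is known to be coherent for kappa in [0, 1]. It is chosen here only because it is simple to compute on a finite scenario set and is not the higher-moment measure used in the numerical studies of this paper. The scenario values, probabilities, and function name are illustrative, and a check on one example does not constitute a proof.

```python
def mean_upper_semideviation(outcomes, probs, kappa=0.5):
    """rho(X) = E[X] + kappa * E[(X - E[X])_+] on a finite scenario set."""
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [0, 1] for coherence")
    mean = sum(p * x for x, p in zip(outcomes, probs))
    upper_dev = sum(p * max(x - mean, 0.0) for x, p in zip(outcomes, probs))
    return mean + kappa * upper_dev


probs = [0.25, 0.25, 0.5]   # scenario probabilities
x = [1.0, 4.0, 2.0]         # losses X in each scenario
y = [3.0, 0.0, 2.0]         # losses Y in each scenario


def rho(z):
    return mean_upper_semideviation(z, probs)


checks = {
    "monotonicity": rho([-1.0, -3.0, 0.0]) <= 0.0,
    "subadditivity": rho([a + b for a, b in zip(x, y)]) <= rho(x) + rho(y) + 1e-12,
    "transitional invariance": abs(rho([a + 3.0 for a in x]) - (rho(x) + 3.0)) < 1e-12,
    "positive homogeneity": abs(rho([2.0 * a for a in x]) - 2.0 * rho(x)) < 1e-12,
}
print(checks)
```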

Observe that $\mathcal{R}(S; X_G)$ violates, in general, the subadditivity requirement with respect to the
stochastic weights. However, risk reduction via diversification is guaranteed by (7), which ensures that the
inclusion of additional vertices in an existing feasible solution never increases the risk. Further, under the
assumption of nonnegative stochastic vertex weights, $X_G \ge 0$, the set risk $\mathcal{R}(S; X_G)$ can be shown to
be subadditive with respect to subsets of $V$,

$$\mathcal{R}(S_1 \cup S_2; X_G) \le \mathcal{R}(S_1; X_G) + \mathcal{R}(S_2; X_G), \quad S_1, S_2 \subseteq V. \tag{8}$$

Clearly, it is required that $S_1$, $S_2$, and $S_1 \cup S_2$ satisfy property Q in conformance to the context of
risk-averse Q problems.

3 Solution approaches for risk-averse k-club problems

In this section, we first address the computational complexity of the RA-k problem for any fixed positive
integer k, and show that this problem is NP-hard. We then propose two exact solution algorithms for
this problem. First, we consider a mathematical programming approach for the RA-k problem, where
the risk $\mathcal{R}(S; X_G)$ of a set $S \in \Gamma_G(k)$ is defined by (6). To this end, we take advantage of a recent
formulation for the maximum k-club problem developed by Veremyev et al. [43]. Next, we propose
a combinatorial branch-and-bound algorithm for solving the RA-k problem that utilizes the same solution
space processing principles for finding maximum k-clubs as the ones used in [13, 17, 34].

In order to establish the problem's complexity and derive the corresponding solution methods, we
need to introduce additional assumptions on the properties of stochastic weights $X_G$ and risk measure
$\rho$ involved in the definition of the risk-averse k-club problem (5). Namely, throughout this section it
is assumed that the stochastic weights $X_i$ of vertices $i \in V$ are nonnegative and rational-valued,
$X_i : \Omega \mapsto \mathbb{Q}_+$, $i \in V$, where $\mathbb{Q}_+$ denotes the set of nonnegative rational numbers. Also, the corresponding
probability measure $\mathsf{P}$ is rational-valued, i.e., $\mathsf{P}\{X_i = X_i(\omega)\} \in \mathbb{Q}_+ \cap [0, 1]$ for all $\omega \in \Omega$ and all $i \in V$.
Similarly, we assume that the risk measure $\rho$ is such that $\rho(X) \in \mathbb{Q}$ whenever $X$ and the underlying
probability measure are rational-valued. In addition, we restrict our attention to risk measures that are
expectation-bounded¹ [37], i.e., such that $\rho(X) > \mathsf{E}X$ for all non-constant $X$, and $\rho(X) = \mathsf{E}X$ for all
constant $X$, or such $X$ that $X = \text{const}$ with probability 1.

3.1  Computational  complexity 

In this section, we derive the computational complexity of the risk-averse k-club problem from the
complexity of the more general class of risk-averse Q problems (4).

For a given property Q, the decision version of the risk-averse Q problem, denoted by $(G, X_G, \rho, c)$, is
as follows. Given a graph $G = (V, E)$, a vector of stochastic weights $X_G$, a l.s.c. risk measure $\rho$, and a
$c \in \mathbb{Q}$, determine whether there exists a set $S \subseteq V$ such that $G[S]$ satisfies Q and $\mathcal{R}(S; X_G) < c$. We
also consider the deterministic maximum Q problem:

$$\max \{ |S| : S \subseteq V \text{ and } G[S] \text{ satisfies } Q \}, \tag{9}$$

and its decision version, denoted as $(G, q)$: given a graph $G = (V, E)$ and an integer $q$, is there a subset
of $V$ that has property Q and order larger than $q$?

Theorem 1 If property Q is such that the decision version of the (deterministic) maximum Q problem is
NP-hard, then the decision version of the risk-averse Q problem is also NP-hard, provided that the risk
measure $\rho$ is proper, l.s.c., and expectation-bounded.

Proof: The intractability of the risk-averse Q problem is proved by a polynomial-time reduction from the
maximum Q problem. Given a graph $G = (V, E)$ and a fixed positive integer $q$, consider the decision
version of the maximum Q problem $(G, q)$. For any such maximum Q decision problem $(G, q)$, we
replicate the graph $G$ and let $X_i$, $i \in V$, be a set of independent and identically distributed random
variables with Bernoulli distribution, such that $\mathsf{P}\{X_i = 0\} = \mathsf{P}\{X_i = 1\} = \frac{1}{2}$ for all $i \in V$. As a risk
measure, we select $\rho(X) = \sigma^2(X) + \mathsf{E}X$, where $\sigma^2(X)$ denotes the variance of $X$. Obviously, $\rho(X)$ is
expectation-bounded, as well as l.s.c. and proper, so that the corresponding set risk function $\mathcal{R}$ is well
defined. It is easy to see that the set risk function $\mathcal{R}(S; X_G)$ becomes equal to

$$\mathcal{R}(S; X_G) = \min \Big\{ \sigma^2\Big( \sum_{i \in S} u_i X_i \Big) + \frac{1}{2} : \sum_{i \in S} u_i = 1; \ u_i \ge 0, \ \forall i \in S \Big\} = \frac{1}{4|S|} + \frac{1}{2}.$$

This procedure constructs in polynomial time an instance $\big(G, X_G, \rho, \frac{1}{4q} + \frac{1}{2}\big)$ of the risk-averse Q problem
such that there exists a Q-subgraph of order larger than $q$ in $G$ if and only if there exists a Q-subgraph
$S$ in $G$ such that $\mathcal{R}(S; X_G) < \frac{1}{4q} + \frac{1}{2}$. This shows that the decision version of the risk-averse Q problem is
NP-hard if the maximum Q problem is NP-hard. □

¹ “Expectation-boundedness” is also known as “aversity” [36], but we use the former term in this work so as to avoid
semantic confusion when referring to “risk-averse” subgraphs.
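The closed form used in the proof can be verified in a few lines. The derivation below is a sketch added for convenience. It relies only on the independence of the $X_i$ and on the facts that a Bernoulli variable with success probability $\frac{1}{2}$ has mean $\frac{1}{2}$ and variance $\frac{1}{4}$.

```latex
\begin{align*}
\rho\Big(\sum_{i \in S} u_i X_i\Big)
  &= \sigma^2\Big(\sum_{i \in S} u_i X_i\Big) + \mathsf{E}\Big[\sum_{i \in S} u_i X_i\Big]
   = \frac{1}{4} \sum_{i \in S} u_i^2 + \frac{1}{2} \sum_{i \in S} u_i
   = \frac{1}{4} \sum_{i \in S} u_i^2 + \frac{1}{2}, \\
\sum_{i \in S} u_i^2
  &\ge \frac{1}{|S|} \Big(\sum_{i \in S} u_i\Big)^2 = \frac{1}{|S|}
   \quad \text{(Cauchy--Schwarz), with equality if and only if } u_i = \frac{1}{|S|} \text{ for all } i \in S, \\
\mathcal{R}(S; X_G)
  &= \frac{1}{4|S|} + \frac{1}{2}.
\end{align*}
```

Since this expression is strictly decreasing in $|S|$, the inequality $\mathcal{R}(S; X_G) < \frac{1}{4q} + \frac{1}{2}$ holds exactly when $|S| > q$, which is the equivalence used in the reduction.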

The computational complexity of the RA-k problem, which we are concerned with in this work, follows
readily from Theorem 1 due to the fact that the (deterministic) maximum k-club problem is known to be
NP-hard [9]:

Corollary 2 The decision version of the risk-averse k-club problem (RA-k) is NP-hard, provided that the risk
measure $\rho$ is proper, l.s.c., and expectation-bounded.

The condition that risk measure $\rho$ in the risk-averse Q problem (4) be l.s.c. and proper ensures
that the resulting set risk function $\mathcal{R}$ is well-defined. Expectation-boundedness, on the other hand, is
imposed so as to avoid situations in which the risk-averse Q problem becomes trivial. In the presented
framework we advocate the use of coherent measures of risk when constructing the set risk function (6).
It turns out, however, that if one selects $\rho(X) = \mathsf{E}X$, which is formally a coherent risk measure yet does
not measure “risk”, then the corresponding problem (4) is polynomially solvable, and, moreover, the
solution is trivial. This can be viewed as an additional supporting argument for pursuing the risk-averse
approach when dealing with graph-theoretical problems on graphs with stochastic vertex weights, since
the traditional “expectation”-based, or risk-neutral, approach to problems with stochastic vertex weights
may not yield interesting results. The following proposition formalizes the above observation.

Proposition 2 Consider the risk-averse Q problem (4), where the risk measure $\rho$ is such that for any
$G = (V, E)$, $X_G$, and $S \subseteq V$,

$$\operatorname*{arg\,min} \Big\{ \rho\Big( \sum_{i \in S} u_i X_i \Big) : \sum_{i \in S} u_i = 1; \ u_i \ge 0, \ \forall i \in S \Big\} = \big\{ u \in \mathbb{R}^{|S|} : u_{i_S} = 1; \ u_i = 0, \ \forall i \in S \setminus \{i_S\} \big\}, \tag{10}$$

and $i_S$ in (10) is computable in polynomial time. Then, the risk-averse Q problem is polynomially
solvable, provided that property Q is such that one can determine in polynomial time whether there
exists a Q-subgraph of $G$ containing a given $i \in V$.

Proof: Obviously, condition (10) implies that

\[
\mathcal{R}(S, X_G) = \rho(X_{i_S}) = \min_{i \in S} \rho(X_i).
\]

Then, in polynomial time one can compute ρ(X_{i_0}) = min_{i ∈ V} ρ(X_i) and it can be verified whether
S_0 ∋ i_0 exists such that G[S_0] satisfies Q. If not, ρ(X_{i_1}) = min_{i ∈ V \ {i_0}} ρ(X_i) is computed and existence
of S_1 ∋ i_1 such that G[S_1] satisfies Q is verified in polynomial time, and so on. Clearly, the risk-averse
Q problem can thus be solved in polynomial time. □

It is easy to see that ρ(X) = EX constitutes a special case of the risk measure described in Proposition 2,
and

\[
\mathcal{R}(S, X_G) = \min_{i \in S} \mathsf{E} X_i .
\]
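The scan used in the proof of Proposition 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary `vertex_risk` is assumed to hold precomputed values ρ(X_i), and `has_q_subgraph_containing` is a hypothetical oracle standing in for the polynomial-time test assumed in the proposition. Neither name comes from the original text.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple


def solve_by_vertex_scan(
    vertex_risk: Dict[Hashable, float],
    has_q_subgraph_containing: Callable[[Hashable], bool],
) -> Optional[Tuple[Hashable, float]]:
    """Scan vertices in nondecreasing order of rho(X_i) and return the first
    vertex that lies in some Q-subgraph, together with its risk.

    Returns None if no vertex belongs to a Q-subgraph.
    """
    for v in sorted(vertex_risk, key=vertex_risk.get):
        # Hypothetical oracle: "is there a Q-subgraph of G containing v?"
        if has_q_subgraph_containing(v):
            return v, vertex_risk[v]
    return None


# Toy data (made up): vertex "b" has the lowest risk but lies in no Q-subgraph.
example_risk = {"a": 0.4, "b": 0.1, "c": 0.3}
example_result = solve_by_vertex_scan(example_risk, lambda v: v != "b")
```

The number of oracle calls is at most |V|, so the overall procedure stays polynomial whenever the oracle does.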

On a related note, Theorem 1 also establishes the computational complexity of risk-averse maximum
hereditary subgraph problems that were discussed in our previous work [38]. Recall that property Q is
called hereditary with respect to induced subgraphs if for any graph G that satisfies Q, removal of any
of its vertices creates an induced subgraph that also satisfies Q. Further, property Q is called interesting if the
order of graphs that satisfy it is unbounded, and it is called nontrivial if it is satisfied by a single-vertex
graph and is not satisfied by every graph (see, e.g., [47]).

Corollary 3 If property Q is hereditary with respect to induced subgraphs, interesting, and nontrivial,
and risk measure ρ is l.s.c., proper, and expectation-bounded, then the risk-averse Q problem is NP-hard.

Note that the k-club property is not hereditary with respect to induced subgraphs. For example, a star
with at least two leaves is a 2-club, but deleting its center leaves only isolated vertices, which do not
form a 2-club.

3.2  A  mathematical  programming  formulation 

In this section, we formulate the RA-k problem as a (generally nonlinear) mixed integer
program. To this effect, let binary decision variables x_i indicate whether node i ∈ V belongs to a subset
S:

\[
x_i = \begin{cases} 1, & i \in S, \\ 0, & \text{otherwise.} \end{cases}
\]

When the property Q denotes a k-club, one can choose the edge formulation of the maximum k-club
problem proposed by Veremyev et al. [43], whereby the mathematical programming formulation of the
RA-k problem takes the form


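\begin{align}
\min\ & \rho\Bigl(\sum_{i \in V} u_i X_i\Bigr) \tag{11a} \\
\text{s.t.}\ & \sum_{i \in V} u_i = 1, \tag{11b} \\
& u_i \le x_i, \quad i \in V, \tag{11c} \\
& y_{ij}^{(k)} \ge x_i + x_j - 1, \quad \forall i, j \in V,\ i \ne j, \tag{11d} \\
& y_{ij}^{(1)} = 0, \quad \forall (i,j) \in \overline{E},\ i \ne j, \tag{11e} \\
& y_{ij}^{(l)} = y_{ij}^{(1)}, \quad \forall (i,j) \in E,\ l \in \{2, \dots, k\}, \tag{11f} \\
& y_{ij}^{(l)} \le \sum_{t : (i,t) \in E} y_{tj}^{(l-1)}, \quad \forall (i,j) \in \overline{E},\ l \in \{2, \dots, k\}, \tag{11g} \\
& y_{ij}^{(l)} \le x_i, \quad y_{ij}^{(l)} = y_{ji}^{(l)}, \quad \forall i, j \in V,\ l \in \{1, \dots, k\}, \tag{11h} \\
& x_i \in \{0, 1\},\ u_i \ge 0,\ y_{ij}^{(l)} \in [0, 1], \quad \forall i, j \in V,\ l \in \{1, \dots, k\}, \tag{11i}
\end{align}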

where Ē represents the set of all complement edges of graph G. Note that nonlinearity in (11) is
attributable to the possible nonlinearity of the risk measure ρ. Appropriate nonlinear mixed-integer
programming solvers can be used to solve formulation (11) provided that the risk measure ρ admits a suitable
mathematical programming representation. A combinatorial branch-and-bound algorithm for solving
the RA-k problem is described next.

3.3  A  combinatorial  branch-and-bound  algorithm 

The combinatorial branch-and-bound (BnB) algorithm for solving problem (11) processes the solution space
by traversing “levels” of the BnB tree to find a subgraph G[S] that represents a maximal k-club of
minimum risk in G as measured by (6). The algorithm begins at level ℓ = 0 with a partial solution
Q := ∅, incumbent solution Q* := ∅, and an upper bound on risk L* := +∞ (risk induced by Q*).
Partial solution Q is composed of vertices that may potentially become a k-club during later stages of
the algorithm, while Q* contains vertices corresponding to a maximal k-club whose risk, L*, is the
smallest up to the current stage. A set of “candidate” vertices C_ℓ is maintained at each level ℓ, from
which a certain branching vertex v_ℓ is selected and added to the partial solution Q, or simply deleted
from set C_ℓ without being added to Q. Note that the initial candidate set is C_0 := V. To ensure proper
navigation between the levels of the BnB tree, the notation P_ℓ^+ or P_ℓ^- is used to indicate whether the last
node of the BnB tree at level ℓ was created by adding v_ℓ to Q, or by deleting v_ℓ from C_ℓ without adding
it to Q, respectively.

Whenever a BnB tree node is created at the consecutive level ℓ + 1, a candidate set C_{ℓ+1} is
constructed by removing all vertices from C_ℓ whose pairwise distances from the vertices in Q exceed k in
the induced graph G[Q ∪ C_ℓ]:

\[
C_{\ell+1} := \{ j \in C_\ell : d_{G[Q \cup C_\ell]}(i, j) \le k,\ \forall i \in Q \}.
\]

Observe that the refinement of C_ℓ may disrupt the structural integrity of the partial solution if the
eliminated candidate vertices serve as distance intermediaries (i.e., comprise the shortest paths) between the
vertices in Q. In other words, the distance between at least one pair of vertices i, j ∈ Q exceeds k
upon removal of one or more vertices from C_ℓ when constructing C_{ℓ+1}. Due to this inherent
distance-based dependence of k-clubs, additional considerations are warranted whenever creating a BnB node by
either adding or deleting a vertex v_ℓ (i.e., P_ℓ^+ or P_ℓ^-, respectively). Therefore, the necessary structural
properties of Q and C_{ℓ+1} at each BnB node are

(C1) Q is a k-clique in G[Q ∪ C_{ℓ+1}], and

(C2) d_{G[Q ∪ C_ℓ]}(i, j) ≤ k, ∀ i ∈ Q, ∀ j ∈ C_{ℓ+1}.

After constructing set C_{ℓ+1} (condition (C2) is satisfied by definition of C_{ℓ+1}), if vertices in C_ℓ \ C_{ℓ+1}
do serve as distance intermediaries, their removal imposes violations with respect to condition (C1). In
such cases, Q cannot become a k-club by exploring deeper levels of the tree and the corresponding node
of the BnB tree is fathomed² by infeasibility via violation of condition (C1).
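The candidate refinement and the check of condition (C1) can be illustrated with a short Python sketch. It assumes an unweighted, undirected graph stored as a dictionary of adjacency sets; the function names are illustrative and are not taken from the original implementation. The example uses a 6-cycle with k = 2 and Q on the even vertices: every odd vertex is three hops from the opposite even vertex, so all candidates are eliminated, and Q then loses the intermediaries that made it a 2-clique.

```python
from collections import deque


def bfs_distances(adj, source, allowed):
    """Hop distances from `source` inside the subgraph induced by `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def refine_candidates(adj, Q, C, k):
    """C_{l+1}: candidates within distance k of every vertex of Q in G[Q u C_l]."""
    allowed = set(Q) | set(C)
    dist = {i: bfs_distances(adj, i, allowed) for i in Q}
    # Unreachable vertices are treated as being at distance k + 1.
    return {j for j in C if all(dist[i].get(j, k + 1) <= k for i in Q)}


def is_k_clique(adj, Q, C_next, k):
    """Condition (C1): all pairs in Q are within distance k in G[Q u C_{l+1}]."""
    allowed = set(Q) | set(C_next)
    for i in Q:
        d = bfs_distances(adj, i, allowed)
        if any(d.get(j, k + 1) > k for j in Q):
            return False
    return True


# 6-cycle 0-1-2-3-4-5-0, k = 2, partial solution on the even vertices.
cycle = {v: {(v - 1) % 6, (v + 1) % 6} for v in range(6)}
Q, C = {0, 2, 4}, {1, 3, 5}
C_next = refine_candidates(cycle, Q, C, 2)
before = is_k_clique(cycle, Q, C, 2)       # Q is a 2-clique in G[Q u C_l]
after = is_k_clique(cycle, Q, C_next, 2)   # (C1) evaluated in G[Q u C_{l+1}]
```

In this example the node would be fathomed by infeasibility, since (C1) fails once the eliminated candidates are gone.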

Whenever condition (C1) is satisfied, the next step entails evaluating the quality of the solution that
can be obtained from the subgraph induced by vertices in Q ∪ C_{ℓ+1}. An exact approach for directly
finding a k-club with the lowest possible risk contained in G[Q ∪ C_{ℓ+1}] would involve solving problem
(11) with x_i = 0 for all i ∈ V \ (Q ∪ C_{ℓ+1}); we denote the corresponding solution by ℛ(Q ∪ C_{ℓ+1}; X_G).
Solving such a (nonlinear) mixed 0-1 program at every node of the BnB tree is clearly impractical.
Instead, the following relaxation problem is utilized to obtain a valid lower bound on ℛ(Q ∪ C_{ℓ+1}; X_G):

\[
\begin{aligned}
\mathcal{L}(Q \cup C_{\ell+1}; X_G) := \min\ & \rho\Bigl(\sum_{i \in V} u_i X_i\Bigr) \\
\text{s.t.}\ & \sum_{i \in V} u_i = 1, \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}), \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1}.
\end{aligned}
\tag{12}
\]

If ℒ(Q ∪ C_{ℓ+1}; X_G) ≥ L*, then the corresponding node of the BnB tree is fathomed by bound due to
the fact that sequential refinement cannot achieve a further reduction in risk.

In the case when ℒ(Q ∪ C_{ℓ+1}; X_G) < L* and Q ∪ C_{ℓ+1} is a k-club, the new incumbent solution
will be Q* = Q ∪ C_{ℓ+1} and the global upper bound on risk is updated, L* = ℒ(Q ∪ C_{ℓ+1}; X_G). In
this case, the current BnB node is fathomed by feasibility. If, however, ℒ(Q ∪ C_{ℓ+1}; X_G) < L* and
Q ∪ C_{ℓ+1} is not a k-club, a branching vertex v_{ℓ+1} is selected at the next level ℓ + 1 and BnB node P_{ℓ+1}^+
will be processed.

²Indicated in Algorithm 1 by the assignment “fathom := True”.

After fathoming a BnB node, the algorithm backtracks as follows. If the current BnB node is of
type P_ℓ^+, then the vertex v_ℓ is removed from Q, and the node associated with the deletion of v_ℓ, P_ℓ^-,
is created. On the other hand, if the BnB node is of type P_ℓ^-, the algorithm sequentially backtracks to
the last level, ℓ' < ℓ, associated with a node of type P_{ℓ'}^+. The node P_{ℓ'}^- is then constructed by removing
the branching vertex v_{ℓ'} from Q. Observe that a node can only be of form P_ℓ^- after P_ℓ^+ has been
fathomed/processed.

Empirical observations suggest that branching on a vertex v_ℓ with the smallest value of ρ(X_{v_ℓ}) or
EX_{v_ℓ} can significantly enhance computational performance. To this end, the vertices in any candidate
set C_ℓ are ordered in descending order with respect to their risks ρ(X_i) or expected values EX_i, and
the last vertex in C_ℓ is always selected when adding vertex v_ℓ to the partial solution Q. The described
branch-and-bound procedure for the RA-k problem is formalized in Algorithm 1.

As shown in [17], the number of leaf nodes in the BnB search tree of
Algorithm 1 is O*(1.62^{|V|}), where the modified notation “O*(g(|V|))” implies O(g(|V|) · poly(|V|))
for some polynomial function poly(|V|). Additionally, at each node of the search tree, all pairwise distances
can be computed in O(|V|^3) time and we solve a linear program to obtain a lower bound on the optimal
solution of the subtree rooted at that node. Therefore, Algorithm 1 runs in O*(1.62^{|V|}) time.

Algorithm 1: Combinatorial branch-and-bound algorithm

Initialize: ℓ := 0; C_0 := V; Q := ∅; Q* := ∅; L* := ∞; node := P_0^+; fathom := False;
while ℓ ≥ 0 do
    if node = P_ℓ^+ then
        select a vertex v_ℓ ∈ C_ℓ;
        C_ℓ := C_ℓ \ {v_ℓ};
        Q := Q ∪ {v_ℓ};
    else
        Q := Q \ {v_ℓ};
    C_{ℓ+1} := {j ∈ C_ℓ : d_{G[Q ∪ C_ℓ]}(i, j) ≤ k, ∀ i ∈ Q};
    if Q is a k-clique in G[Q ∪ C_{ℓ+1}] then
        if ℒ(Q ∪ C_{ℓ+1}) < L* then
            if Q ∪ C_{ℓ+1} is a k-club then
                Q* := Q ∪ C_{ℓ+1};
                L* := ℒ(Q ∪ C_{ℓ+1});
                fathom := True;
        else
            fathom := True;
    else
        fathom := True;
    if fathom = True then
        while ℓ ≥ 0 and node = P_ℓ^- do
            ℓ := ℓ − 1;
        node := P_ℓ^-;
        fathom := False;
    else
        ℓ := ℓ + 1;
        node := P_ℓ^+;
return Q*;
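For readers who prefer running code, the search can be restated as a compact recursive depth-first procedure in Python. This is a sketch under stated assumptions and not the authors' C++ implementation: it replaces the explicit level bookkeeping of Algorithm 1 with recursion, the graph is assumed to be a dictionary of adjacency sets, and `set_risk` is a user-supplied callable that is assumed to return the lower bound ℒ(S; X_G) of problem (12) for a vertex set S. In the usage example `set_risk` is the risk-neutral special case ρ(X) = EX, for which the bound reduces to the smallest expected weight in S.

```python
import math
from collections import deque


def _dist(adj, s, allowed):
    """BFS hop distances from s inside the subgraph induced by `allowed`."""
    d, queue = {s: 0}, deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in d:
                d[w] = d[u] + 1
                queue.append(w)
    return d


def _within_k(adj, sources, targets, allowed, k):
    """True if every source is within k hops of every target in G[allowed]."""
    for s in sources:
        d = _dist(adj, s, allowed)
        if any(d.get(t, k + 1) > k for t in targets):
            return False
    return True


def min_risk_k_club(adj, k, set_risk, vertex_risk):
    """Recursive variant of the branch-and-bound search (illustrative only).

    set_risk(S) is assumed to return the bound L(S; X_G) from (12);
    vertex_risk[v] is used only to choose the branching vertex.
    """
    best = {"Q": frozenset(), "L": math.inf}

    def explore(Q, C):
        allowed = Q | C
        # Candidate refinement: keep j within distance k of all of Q.
        dQ = {i: _dist(adj, i, allowed) for i in Q}
        C = {j for j in C if all(dQ[i].get(j, k + 1) <= k for i in Q)}
        S = Q | C
        if not S or not _within_k(adj, Q, Q, S, k):
            return                        # fathom by infeasibility, (C1)
        bound = set_risk(S)
        if bound >= best["L"]:
            return                        # fathom by bound
        if _within_k(adj, S, S, S, k):    # S induces a k-club
            best["Q"], best["L"] = frozenset(S), bound
            return                        # fathom by feasibility
        v = min(C, key=vertex_risk.get)   # least risky candidate
        explore(Q | {v}, C - {v})         # node of type P+
        explore(Q, C - {v})               # node of type P-

    explore(set(), set(adj))
    return best["Q"], best["L"]


# Path 0-1-2-3 with k = 2 and made-up expected vertex weights.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
mean = {0: 0.5, 1: 0.4, 2: 0.3, 3: 0.1}
Q_star, L_star = min_risk_k_club(
    path, 2, lambda S: min(mean[v] for v in S), mean
)
```

On the path example the first branch adds the least risky vertex 3, the refinement drops vertex 0, and the remaining set {1, 2, 3} is a 2-club, so the deletion branch is pruned by bound.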

4 Case study: Risk-averse k-club problem with higher moment coherent risk measures

In this section, we present a computational framework for problem (11) and conduct numerical
experiments demonstrating the computational performance of the proposed BnB algorithm. To this end, we
adopt higher moment coherent risk measures to quantify the risk as described next.

4.1  Higher  moment  coherent  risk  measures 

The class of higher-moment coherent risk (HMCR) measures was introduced in [27] as optimal values
to the following stochastic programming problem:

\[
\mathrm{HMCR}_{\alpha,p}(X) = \min_{\eta \in \mathbb{R}}\ \eta + (1 - \alpha)^{-1} \bigl\| (X - \eta)^+ \bigr\|_p, \quad \alpha \in (0, 1),\ p \ge 1,
\tag{13}
\]

where X^+ = max{0, X} and ‖X‖_p = (E|X|^p)^{1/p}. Mathematical programming problems that contain
HMCR measures can be formulated using p-order cone constraints. Typically, in stochastic
programming models, the set of random events Ω is assumed to be discrete, Ω = {ω_1, …, ω_N}, with the
probabilities P{ω_k} = π_k > 0, and π_1 + ··· + π_N = 1. The corresponding mathematical programming
model (11) with ρ(X) = HMCR_{α,p}(X) takes the following mixed 0-1 p-order cone programming form:

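\[
\begin{aligned}
\min\ & \eta + (1 - \alpha)^{-1} t_0 \\
\text{s.t.}\ & t_0 \ge \| (t_1, \dots, t_N) \|_p, \\
& \pi_k^{-1/p}\, t_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \dots, N, \\
& \sum_{i \in V} u_i = 1, \\
& u_i \le x_i, \quad i \in V, \\
& \text{(11d)-(11i)}, \\
& t_k \ge 0, \quad k = 0, \dots, N,
\end{aligned}
\tag{14}
\]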

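where X_{ik} represents the realization of the stochastic weight of vertex i ∈ V under scenario k ∈
{1, …, N}. Analogously, the lower bound problem (12) takes the form

\[
\begin{aligned}
\mathcal{L}(Q \cup C_{\ell+1}; X_G) = \min\ & \eta + (1 - \alpha)^{-1} t_0 \\
\text{s.t.}\ & t_0 \ge \| (t_1, \dots, t_N) \|_p, \\
& \pi_k^{-1/p}\, t_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \dots, N, \\
& \sum_{i \in V} u_i = 1, \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1}, \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}), \\
& t_k \ge 0, \quad k = 0, \dots, N.
\end{aligned}
\tag{15}
\]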

For instances when p ∈ {1, 2}, problems (14) and (15) reduce to linear programming (LP) and second
order cone programming (SOCP) models, respectively. However, in cases when p ∈ (1, 2) ∪ (2, ∞)
the p-cone is not self-dual and there exist no efficient long-step self-dual interior point solution methods.
Consequently, we employ solution methods for p-order cone programming problems that are based on
polyhedral approximations of p-order cones [45] and representation of rational-order p-cones via second
order cones [31].
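Definition (13) can also be evaluated directly for a single discrete random variable, which is handy for sanity-checking a conic model on small data. The sketch below is an assumption-laden illustration rather than the method used in the paper: it minimizes the convex one-dimensional objective in η by ternary search instead of solving a p-order cone program, and it assumes the scenario probabilities are positive and sum to one. The lower end of the search bracket follows from the bound η + (1 − α)^{-1}‖(X − η)^+‖_p ≥ η + (1 − α)^{-1}(EX − η), which already exceeds max X for smaller η.

```python
def hmcr(values, probs, alpha, p, iters=200):
    """Approximate HMCR_{alpha,p}(X) for a discrete X by ternary search in eta.

    values[k] is the realization under scenario k and probs[k] its probability.
    """
    def objective(eta):
        tail = sum(pi * max(x - eta, 0.0) ** p for x, pi in zip(values, probs))
        return eta + tail ** (1.0 / p) / (1.0 - alpha)

    mean = sum(pi * x for x, pi in zip(values, probs))
    hi = max(values)                      # objective(hi) == hi
    # Below this point the objective is at least objective(hi).
    lo = (mean - (1.0 - alpha) * hi) / alpha
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2                       # a minimizer lies in [lo, m2]
        else:
            lo = m1                       # a minimizer lies in [m1, hi]
    return objective(0.5 * (lo + hi))


# Ten equiprobable outcomes 1..10: for p = 1 the measure is the average of
# the worst 20% of outcomes when alpha = 0.8.
outcomes = [float(x) for x in range(1, 11)]
weights = [0.1] * 10
cvar_80 = hmcr(outcomes, weights, alpha=0.8, p=1)
```

For p = 1 the value coincides with Conditional Value-at-Risk, which gives a simple check of the routine; larger p can only increase the result because the p-norm under a probability measure is nondecreasing in p.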

4.2  Setup  of  the  numerical  experiments  and  results 

Numerical experiments on the risk-averse k-club problem for k = 2, 3, 4 were conducted on randomly
generated Erdős–Rényi graphs of orders |V| = 50, 100, 200 with average densities D(G) = 0.0125,
0.025, 0.05, 0.1, and 0.15. The specified densities were chosen due to empirical observations indicating
that a graph of order |V| ≥ 50 commonly reduces to a 2-club when the density is in the range [0.15, 0.25].
Clearly, this effect is even more pronounced for k > 2. The stochastic weights of graphs' vertices were
generated as i.i.d. samples from the uniform U(0, 1) distribution. Scenario sets with N = 250 scenarios
were generated for each combination of graph order and density. The HMCR risk measures (13) with
p = 1, 2, 3, and α = 0.9 were used.
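A test instance of the kind described above could be generated along the following lines. The sketch rests on assumptions that the text does not spell out: the random graph is drawn from the G(n, p) model with edge probability equal to the target density, scenarios are taken to be equiprobable, and the function and variable names are purely illustrative.

```python
import random


def generate_instance(n, density, n_scenarios, seed=0):
    """Erdos-Renyi style graph plus i.i.d. U(0,1) scenario weights per vertex."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < density:    # each edge present independently
                adj[i].add(j)
                adj[j].add(i)
    # weights[v][k] is the realization X_{vk} of vertex v under scenario k.
    weights = {v: [rng.random() for _ in range(n_scenarios)] for v in range(n)}
    # Equiprobable scenarios (an assumption of this sketch).
    probs = [1.0 / n_scenarios] * n_scenarios
    return adj, weights, probs


graph, scenario_weights, scenario_probs = generate_instance(50, 0.1, 250, seed=1)
```

Fixing the seed makes an instance reproducible, which matters when averaging results over several instances per configuration.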

The BnB algorithm has been coded in C++, and we used the CPLEX Simplex and Barrier solvers
for the polyhedral approximations and SOCP reformulations of the p-order cone programming lower
bound problem (15), respectively (see [25]). For instances when p = 1, the CPLEX Simplex solver was
utilized to solve problem (15) directly. The computations were conducted on an Intel Xeon 3.30GHz PC
with 128GB RAM, and the CPLEX 12.6 solver in a Windows 7 64-bit environment was used.

The computational performance of the mathematical programming model (14) was compared with
that of the developed BnB algorithm. In the case of p = 1, problem (14) was solved with the CPLEX Mixed
Integer Programming (MIP) solver. The CPLEX MIP Barrier solver was used for the SOCP version in
the case of p = 2, and for the SOCP reformulation in the case of p = 3.

Tables 1-3 present the computational times and the best objective values averaged over five instances
for each graph configuration, as well as the number of instances for which an optimal solution was
attained within a 3600 second time limit. The reported average time is calculated by only considering the
instances where the problem was solved to optimality within the time limit, while the reported average
objective value is calculated by only considering the instances in which at least a feasible solution was
found within the time limit. The symbol “—” was used to indicate that the time limit was exceeded,
and cells containing “NA” correspond to instances for which the solution process failed due to CPLEX
running out of memory. Table 1 demonstrates that the BnB algorithm significantly outperforms the
CPLEX MIP solver over all the listed graph configurations when k = 2, achieving up to an order of
magnitude of improvement in computational time. Further, observe that the quality of the average best
objectives obtained by the BnB algorithm was superior whenever both methods failed to reach an optimal
solution within the time limit. In cases when CPLEX failed due to memory capacity issues, the BnB
algorithm attained either an optimal solution or an incumbent solution, in which case the average objective
value associated with the best incumbent solutions is provided. Note that the performance of both algorithms
decreases for higher values of p. This becomes particularly pronounced for p = 3 and |V| = 200
in Table 1, where CPLEX could not manage any of the corresponding instances due to the increased
problem size associated with the cutting-plane algorithm for solving polyhedral approximations of
p-order cone programming problems, while the BnB algorithm only solved eleven instances within the
time limit.
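The averaging rules stated above can be made concrete with a small Python sketch. The record layout (keys `time`, `optimal`, `objective`) is hypothetical and chosen only for illustration; the original experiments were not necessarily logged this way.

```python
def summarize(runs, time_limit=3600.0):
    """Aggregate the runs of one graph configuration.

    Each run is a dict with keys 'time' (seconds), 'optimal' (bool) and
    'objective' (float, or None if no feasible solution was found).
    """
    solved = [r for r in runs if r["optimal"] and r["time"] <= time_limit]
    feasible = [r for r in runs if r["objective"] is not None]
    # Average time: only instances solved to optimality within the limit.
    avg_time = sum(r["time"] for r in solved) / len(solved) if solved else None
    # Average objective: every instance with at least a feasible solution.
    avg_obj = (
        sum(r["objective"] for r in feasible) / len(feasible)
        if feasible
        else None
    )
    return {"time": avg_time, "instances": len(solved), "objective": avg_obj}


# Made-up runs: one solved, one with an incumbent only, one with no solution.
example_runs = [
    {"time": 10.0, "optimal": True, "objective": 0.2},
    {"time": 3600.0, "optimal": False, "objective": 0.4},
    {"time": 3600.0, "optimal": False, "objective": None},
]
example_summary = summarize(example_runs)
```

Because the two averages are taken over different subsets of instances, a cell can report an objective value even when no instance was solved to optimality and the time entry is empty.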

A similar improvement in performance can be observed for k = 3 and k = 4 in Tables 2-3. As k
increases, the number of time limit and memory capacity limit violations for CPLEX increases, further
demonstrating the applicability of the proposed BnB method. This observable disadvantage associated
with model (11) results from the fact that the number of constraints in model (11) rapidly increases with
k, thus overwhelming the solver in many cases. All the instances in Table 3 with |V| = 200 are of this
type.

Based on the results presented in Tables 1-3, it is worth noting that as D(G) increases for a given p
and |V|, the average computation time for the BnB algorithm increases, reaches a maximum value, and
then decreases. This is due to the fact that once D(G) is large enough, graph G tends to contain larger
components of lower diameter that can be detected at the early stages of the BnB algorithm. Another
interesting observation is that for a given p and D(G), if |V| is large enough, the average computation
time for the BnB algorithm decreases as |V| increases. For instance, in Table 2, for p = 2 and D(G) = 0.1,
none of the instances with |V| = 100 were solved to optimality, while all the instances with |V| = 200
were solved to optimality within 4.05 seconds on average. This observation can be justified by the fact
that for a given expected edge density D(G), if |V| is sufficiently large, the diameter of the random graph
decreases as |V| increases (see, e.g., [11], p. 62). Therefore, in these cases, the graphs with larger |V|
tend to have larger components of low diameter that can likewise be detected during the early stages of
the BnB algorithm.

In order to demonstrate the applicability of our algorithms on real-life graphs, Tables 4-6 present the
results obtained from solving various DIMACS graph instances with the same number of scenarios and
distribution of uncertain vertex weights as above. Observe that the BnB method outperforms CPLEX
over the vast majority of tested instances, and improvements of more than two orders of magnitude
were observed in various cases. However, in several cases even the BnB algorithm failed to obtain an
incumbent solution within the time limit (denoted by “∞”), underscoring the complex nature of many
real-life graphs.


                                        p = 1                      p = 2                      p = 3
D(G)    Algorithm  Output       |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200
0.0125  CPLEX      Time (s)       0.95     4.46    61.64    32.18    91.33  3043.69   129.07   278.05       NA
                   Instance          5        5        5        5        5        1        5        5        0
                   Objective      0.23     0.21     0.19     0.28     0.25     0.37     0.30     0.25       NA
        BnB        Time (s)       0.23     1.04     5.01     8.63    25.13    80.27    35.64   100.08   301.68
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.23     0.21     0.19     0.28     0.25     0.21     0.30     0.25     0.21
0.025   CPLEX      Time (s)       1.44     7.49   177.34    46.96   233.35        —    93.66   352.55       NA
                   Instance          5        5        5        5        5        0        5        5        0
                   Objective      0.23     0.20     0.17     0.28     0.23     0.54     0.29     0.23       NA
        BnB        Time (s)       0.24     1.11     7.62    10.90    30.35   167.98    51.66   141.57   746.82
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.23     0.20     0.17     0.28     0.23     0.19     0.29     0.23     0.19
0.05    CPLEX      Time (s)       1.92    14.64  2185.63    60.53   472.10        —   123.17   776.51       NA
                   Instance          5        5        5        5        5        0        5        5        0
                   Objective      0.20     0.18     0.15     0.23     0.19     0.20     0.24     0.19       NA
        BnB        Time (s)       0.26     2.02    37.88    15.43    75.50  1087.92    65.07   368.43  3051.66
                   Instance          5        5        5        5        5        5        5        5        1
                   Objective      0.20     0.18     0.15     0.23     0.19     0.16     0.24     0.19     0.16
0.1     CPLEX      Time (s)       4.25   322.01        —   150.96        —        —   423.76        —       NA
                   Instance          5        5        0        5        0        0        5        0        0
                   Objective      0.18     0.15     0.14     0.20     0.18     0.41     0.20     0.18       NA
        BnB        Time (s)       0.59    28.51        —    38.03  1451.43        —   183.78        —        —
                   Instance          5        5        0        5        5        0        5        0        0
                   Objective      0.18     0.15     0.14     0.20     0.16     0.15     0.20     0.16     0.15
0.15    CPLEX      Time (s)       9.48  2832.38        —  1055.83        —        —  1862.11        —       NA
                   Instance          5        2        0        5        0        0        5        0        0
                   Objective      0.17     0.14     0.18     0.18     0.16     0.17     0.18     0.16       NA
        BnB        Time (s)       2.41  2033.67        —   164.06        —        —   707.16        —        —
                   Instance          5        3        0        5        0        0        5        0        0
                   Objective      0.17     0.14     0.18     0.18     0.14     0.20     0.18     0.15     0.22

Table 1: Average computation times (in seconds), number of instances solved to optimality (out of five)
and the average best objective values obtained by solving problem (11) using the proposed BnB algorithm
and CPLEX with k = 2 and risk measure (13).

                                        p = 1                      p = 2                      p = 3
D(G)    Algorithm  Output       |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200
0.0125  CPLEX      Time (s)       0.88     6.12       NA    14.16   148.36       NA    86.09   258.75       NA
                   Instance          5        5        0        5        5        0        5        5        0
                   Objective      0.22     0.19       NA     0.27     0.22       NA     0.27     0.21       NA
        BnB        Time (s)       0.23     1.04     6.80     8.84    29.01   162.51    36.64   113.63   708.45
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.22     0.19     0.16     0.27     0.22     0.18     0.27     0.21     0.18
0.025   CPLEX      Time (s)       1.26    12.68       NA    27.07   516.78       NA    69.69   577.47       NA
                   Instance          5        5        0        5        5        0        5        5        0
                   Objective      0.21     0.18       NA     0.24     0.19       NA     0.25     0.19       NA
        BnB        Time (s)       0.24     1.59    81.50    11.28    65.54  2075.28    52.72   286.93        —
                   Instance          5        5        5        5        5        4        5        5        0
                   Objective      0.21     0.18     0.15     0.24     0.19     0.15     0.25     0.19     0.15
0.05    CPLEX      Time (s)       2.79   287.23       NA   163.45        —       NA   385.90        —       NA
                   Instance          5        5        0        5        0        0        5        0        0
                   Objective      0.17     0.14       NA     0.19     0.17       NA     0.19     0.16       NA
        BnB        Time (s)       0.43    44.13        —    29.41  1060.88        —   131.34  1531.64        —
                   Instance          5        5        0        5        4        0        5        1        0
                   Objective      0.17     0.14     0.14     0.19     0.15     0.18     0.19     0.15     0.17
0.1     CPLEX      Time (s)      14.27    25.41       NA  2656.62  3425.15       NA  2797.45  3311.62       NA
                   Instance          5        2        0        5        1        0        2        1        0
                   Objective      0.15     0.11       NA     0.15     0.11       NA     0.15     0.11       NA
        BnB        Time (s)       3.00   719.53     3.70   367.80        —     4.05   941.06        —     4.96
                   Instance          5        3        5        5        0        5        5        0        5
                   Objective      0.15     0.11     0.10     0.15     0.11     0.10     0.15     0.11     0.10
0.15    CPLEX      Time (s)      50.30   480.23       NA  2329.51        —       NA  1003.57        —       NA
                   Instance          5        5        0        2        0        0        3        0        0
                   Objective      0.13     0.11       NA     0.13     0.12       NA     0.13     0.12       NA
        BnB        Time (s)       2.50     0.29     4.63   762.31     0.50     4.94   998.11     1.51     5.89
                   Instance          5        5        5        5        5        5        4        5        5
                   Objective      0.13     0.11     0.10     0.13     0.11     0.10     0.13     0.11     0.10

Table 2: Average computation times (in seconds), number of instances solved to optimality (out of five)
and the average best objective values obtained by solving problem (11) using the proposed BnB algorithm
and CPLEX with k = 3 and risk measure (13).

5  Conclusions 

We have considered the RA-k problem, which entails finding a k-club of minimum risk in a graph. HMCR
risk measures were utilized for quantifying the distributional information of the stochastic factors
associated with vertex weights. It was shown that the decision version of the RA-k problem is NP-hard for any
fixed positive integer k, and the optimal solutions are maximal k-clubs. A combinatorial BnB solution
algorithm was developed and tested on special cases of the RA-k problem with k = 2, 3, 4. Numerical
experiments on randomly generated graphs of various configurations suggest that the proposed BnB
algorithm can significantly reduce solution times in comparison with the mathematical programming
model solved using the CPLEX MIP solver.

                                        p = 1                      p = 2                      p = 3
D(G)    Algorithm  Output       |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200   |V|=50  |V|=100  |V|=200
0.0125  CPLEX      Time (s)       0.90     7.30       NA    27.09   206.30       NA   118.25   299.46       NA
                   Instance          5        5        0        5        5        0        5        5        0
                   Objective      0.21     0.17       NA     0.25     0.18       NA     0.26     0.19       NA
        BnB        Time (s)       0.23     1.13    15.13     8.25    31.98   409.93    47.41   160.66  2001.38
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.21     0.17     0.14     0.25     0.18     0.15     0.26     0.19     0.15
0.025   CPLEX      Time (s)       1.60    21.97       NA    50.10  1475.67       NA    79.91  1714.83       NA
                   Instance          5        5        0        5        5        0        5        5        0
                   Objective      0.19     0.15       NA     0.22     0.16       NA     0.23     0.16       NA
        BnB        Time (s)       0.23     2.46        —    11.58    91.57        —    63.31   514.33        —
                   Instance          5        5        0        5        5        0        5        5        0
                   Objective      0.19     0.15     0.12     0.22     0.16     0.13     0.23     0.16     0.13
0.05    CPLEX      Time (s)       4.34        —       NA   461.18        —       NA   929.33        —       NA
                   Instance          5        0        0        5        0        0        5        0        0
                   Objective      0.16     0.12       NA     0.16     0.12       NA     0.16     0.12       NA
        BnB        Time (s)       0.66   728.07     2.71    35.37        —     3.06   177.83        —     4.23
                   Instance          5        5        5        5        0        5        5        0        5
                   Objective      0.16     0.12     0.10     0.16     0.12     0.10     0.16     0.12     0.10
0.1     CPLEX      Time (s)      33.72  1776.46       NA   898.37   449.72       NA   493.52   500.88       NA
                   Instance          5        4        0        3        3        0        4        3        0
                   Objective      0.13     0.13       NA     0.13     0.11       NA     0.13     0.11       NA
        BnB        Time (s)       4.63     0.25     3.71   187.73     0.47     4.07   236.05     1.72     5.33
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.13     0.11     0.10     0.13     0.11     0.10     0.13     0.11     0.10
0.15    CPLEX      Time (s)      25.40  2503.89       NA   282.75        —       NA   271.86        —       NA
                   Instance          5        5        0        5        0        0        5        0        0
                   Objective      0.13     0.11       NA     0.13     0.12       NA     0.13     0.12       NA
        BnB        Time (s)       0.04     0.30     4.63     0.22     0.53     4.97     1.54     1.95     6.42
                   Instance          5        5        5        5        5        5        5        5        5
                   Objective      0.13     0.11     0.10     0.13     0.11     0.10     0.13     0.11     0.10

Table 3: Average computation times (in seconds), number of instances solved to optimality (out of five)
and the average best objective values obtained by solving problem (11) using the proposed BnB algorithm
and CPLEX with k = 4 and risk measure (13).

Mixed-Integer Programming with a Class of Nonlinear Convex Constraints


Alexander Vinel*  Pavlo A. Krokhmal†


Abstract 

We study solution approaches to a class of mixed-integer nonlinear programming problems that
arise from recent developments in risk-averse stochastic optimization and contain second-order and
p-order cone programming as special cases. We explore possible applications of some of the solution
techniques that have been successfully used in mixed-integer conic programming and show how they
can be generalized to the problems under consideration. Particularly, we consider a branch-and-bound
method based on outer polyhedral approximations, lifted nonlinear cuts, and linear disjunctive cuts.
Results of numerical experiments with discrete portfolio optimization models are presented.

Keywords: Mixed-integer nonlinear programming, measures of risk, branch-and-bound, valid
inequalities, conic programming


1  Introduction 


In this work we consider solution approaches to a special class of mixed-integer nonlinear optimization
problems that includes, among others, mixed-integer second-order and p-order cone programming problems.
Developing the corresponding solution approaches can also be viewed as a way to explore the
applicability of some of the methods extensively used in the mixed-integer conic programming literature in a more
general setting. While our interest in the particular class of problems studied here stems from recent
developments in risk-averse stochastic optimization (Vinel and Krokhmal, 2014b; Rysz et al., 2014),
similar models may arise in other fields of science and engineering in the context of “generalized means”
(see below). Namely, in the present study we consider mixed-integer nonlinear programming problems
of the form


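\begin{align}
\min\ & c^\top x \tag{1a} \\
\text{s.t.}\ & v_k^{-1}\Bigl( \sum_{j} p_j^k\, v_k\Bigl( \sum_{i=1}^{n} a_{ij}^k x_i + b_j^k \Bigr) \Bigr) \le \sum_{i=1}^{n} a_{i0}^k x_i + b_0^k, \quad k = 1, \dots, K, \tag{1b} \\
& Hx \le h, \tag{1c} \\
& x \in \mathbb{Z}_+^{n_1} \times \mathbb{R}^{n_2}, \tag{1d}
\end{align}

where n = n_1 + n_2 is the dimensionality of the mixed-integer decision vector x, and c, h, H are vectors
and a matrix of appropriate dimensions.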

The main object of interest in problem (1) is the set of nonlinear constraints (1b), where it is assumed that
coefficients p_j^k are positive, p_j^k > 0, for all values of j and k, and functions v_k : ℝ → ℝ, k = 1, …, K,
have the following properties:

(i) v_k(t) = 0 for t ≤ 0,

(ii) v_k(t) are increasing and convex for t ≥ 0,

(iii) v_k are such that constraints (1b) are convex.

To simplify the exposition and notation, in what follows we are going to suppress index k in (1b),
effectively considering problem (1) with a single nonlinear constraint, K = 1. Then, given the above
assumptions on function v, it is straightforward to see that problem (1) can be rewritten in the form

min  cTx 

s.  t.  Wo  > 

Wj  > 

Wo  < 

Hx  < 

The  expression  in  the  right-hand  side  of  the  nonlinear  constraint  (2b)  is  well  known  in  the  litera¬ 
ture  under  the  names  of  quasi-arithmetic ,  Kolmogorov,  or  Kolmogorov -Nagumo  mean  of  the  sequence 
{w  1 . wm  | ,  provided  that  the  positive  coefficients  pj  satisfy  p\  +  . . .  +  pm  =  1  (see,  for  exam¬ 

ple,  Bullen  et  al.,  1988;  Hardy  et  al.,  1952).  In  the  operations  research  and  economics  domains,  it  is 
related  to  the  concept  of  certainty  equivalent  (Wilson,  1979;  McCord  and  Neufville,  1986),  or  the  de¬ 
terministic  quantity  such  that  a  rational  decision  maker  with  a  utility  function  v  would  be  indifferent 

between  choosing  this  certain  quantity  or  a  random  outcome  W  that  may  have  realizations  w  1 . wm 

with  probabilities  p\ , . . . ,  pm . 

In  the  present  work,  our  interest  in  solving  problems  of  the  form  (l)-(2)  derives  from  risk-averse  stochas¬ 
tic  optimization  models  that  employ  the  certainty  equivalent  measures  of  risk  (Vinel  and  Krokhmal, 
2014b,  see  also  Sections  2  and  5).  This  application  also  dictates  the  above  requirements  (i)-(iii)  on 
functions  v.  At  the  same  time,  it  is  easy  to  see  that  conditions  (i)— (iii)  naturally  imply  that  the  nonlin¬ 
ear  convex  constraint  (lb)  represents  a  direct  generalization  of  the  second-order  cone,  or,  more  broadly, 
p-order  cone  constraints  wo  >  || (w\, . . . ,  wm)\\p. 

Formulation  (1)  without  the  integrality  constraints  has  been  previously  considered  in  Rysz  et  al.  (2014). 
That  work  concentrates  on  linear  constraints  (2c),  particularly  in  the  case  when  the  value  of  m  is  large, 
which  in  the  stochastic  programming  setting  corresponds  to  a  large  number  of  scenarios  (see  Section 
2).  This  computational  challenge  have  been  addressed  by  employing  an  efficient  scenario  decomposi¬ 
tion  framework.  In  the  present  endeavor  we  focus  our  attention  on  the  challenges  associated  with  the 
nonlinear  and  integrality  constraints  in  (1).  From  this  point  of  view,  problem  (1)  can  be  characterized 

2 


(m 

E  PjV(Wj) 
j= 1 


JfajjXi+bj,  j  =  l, ...  ,m 
i  =  1 
n 

E  a-ioxi  +  bo 

i  =  i 

h,  w  >  0,  xeZjxM^2. 


(2a) 

(2b) 

(2c) 

(2d) 

(2e) 
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as  a  mixed-integer  nonlinear  programming  (MINLP)  problem  with  a  convex  continuous  relaxation,  and 
there  exists  an  extensive  body  of  literature  discussing  solution  methods  for  either  general  MINLP  or 
mixed-integer  conic  programming  (MICP).  Since  the  formulation  considered  here  is  in  some  sense  “in 
between”  of  these  two  classes,  our  discussion  is  concentrated  on  attempts  to  utilize  the  specific  structure 
of  the  nonlinear  constraint.  While  constraint  (2b)  is  no  longer  necessary  conic,  in  our  discussion  below 
we  will  show  that  some  of  the  solution  procedures  proposed  for  second-  or  /; -order  cone  programming 
(SOCP  or  pOCP)  problems  can  be  extended  to  this  class  as  well. 

Both of the most widely used approaches in mixed-integer programming (the branch-and-bound
algorithm and valid inequalities) are developed in this paper in relation to problem (2). We
begin by discussing the risk-averse stochastic programming motivation for this problem in Section 2. In
Section 3 we present a version of the branch-and-bound method targeted at the specific nonlinear constraints
considered in this paper. Next, in Section 4 we address two procedures for generating inequalities
valid for the feasible set of (2): lifted nonlinear cuts and disjunctive cuts. Finally, in Section 5 we
present some results of numerical experiments. The relevant literature is reviewed in Sections
3 and 4.

In terms of the developed solution procedures, the main contributions of this paper are, in our view, the
following. First, we show that two techniques (a special implementation of branch-and-bound and lifted
nonlinear valid inequalities) that have been proposed in the context of mixed-integer second-order cone
programming (MISOCP) problems can be extended to the more general case considered here. While
both of these extensions do not require novel theoretical developments, relying heavily on the results
already established in the literature, the novelty of the problem formulation justifies, in our view, our
interest in these extensions. In particular, we show how these techniques can be reformulated in order
to address this new application area, while still allowing for the use of the already existing theoretical
basis. Second, we propose another numerical approach, which relies on a simple geometric idea for the
construction of linear disjunctive cuts. To the best of our knowledge, this particular scheme has not been
considered in the literature before.

2 Risk-Averse Stochastic Programming Motivation

Consider a function $\rho : \mathcal{X} \mapsto \mathbb{R} \cup \{+\infty\}$, where $\mathcal{X}$ is an appropriate linear space of $\mathcal{F}$-measurable
functions on a probability space $(\Omega, \mathcal{F}, P)$ such that $X = X(\omega) \in \mathcal{X}$ is interpreted as a random outcome
representing a cost or loss associated with the uncertain event $\omega \in \Omega$. Then, function $\rho$ is referred to as a
risk measure, and defines a system of preferences on $\mathcal{X}$ (outcome $X$ is preferred to $Y$ iff $\rho(X) \le \rho(Y)$).
Additionally, suppose that outcome $X$ depends on the value of a decision vector $x \in \mathsf{X}$. In this case
a problem of optimal decision making under uncertainty can be formulated as a (risk-averse) stochastic
programming problem

\[ \min\{\, c(x) \mid \rho(X(x,\omega)) \le h(x),\ x \in \mathsf{X} \,\}. \tag{3} \]

Problems of this kind involving various forms of risk measure $\rho$ have been extensively studied in the
literature; see, e.g., Krokhmal et al. (2011) for a survey. In the context of this paper we are concerned with
a particular type of certainty equivalent measures of risk, introduced in Vinel and Krokhmal (2014b),
which are defined as

\[ \rho(X) := \min_{\eta}\ \eta + \frac{1}{1-\alpha}\, v^{-1}\mathsf{E}\, v\big([X - \eta]_+\big), \]

where the deutility function $v$ is nondecreasing, convex, such that $v^{-1}\mathsf{E}\,v(X)$ is convex, and $v(t) =
v([t]_+) = v(\max\{0, t\})$. The class of certainty equivalent measures of risk possesses important methodological
characteristics, such as convexity, isotonicity with respect to the stochastic dominance ordering
induced by deutility function $v$ (and, in particular, second-order stochastic dominance), etc., and contains
some well-known risk measures as special cases, including CVaR (Rockafellar and Uryasev, 2002)
and HMCR (Krokhmal, 2007).

Certainty equivalent measures of risk are amenable to simple implementation in stochastic programming
models via constraints of the form (2b) if the set of random events $\Omega$ can be assumed finite:
$\Omega = \{\omega_1,\dots,\omega_m\}$ and $P\{\omega_j\} = p_j > 0$ for $j = 1,\dots,m$. Then, stochastic programming problem (3) can
be equivalently reformulated as

\[ \min\Big\{\, c(x) \;\Big|\; \eta + (1-\alpha)^{-1}\, v^{-1}\Big(\sum_{j=1}^{m} p_j\, v\big([X(x,\omega_j) - \eta]_+\big)\Big) \le h(x),\ x \in \mathsf{X},\ \eta \in \mathbb{R} \,\Big\}. \tag{4} \]

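If, additionally, it can be assumed that the loss function $X(x,\omega)$ is linear with respect to the decision
vector, i.e., $X(x,\omega_j) = a_j^\top x + b_j$, and $x \in \mathbb{Z}^{n_1} \times \mathbb{R}^{n_2}$, then (4) can be written as a special case of
MINLP (2)

\begin{align}
\min\quad & c(x) \tag{5a}\\
\text{s.t.}\quad & \eta + (1-\alpha)^{-1} w_0 \le h(x), \tag{5b}\\
& w_0 \ge v^{-1}\Big(\sum_{j=1}^{m} p_j\, v(w_j)\Big), \tag{5c}\\
& w_j \ge a_j^\top x + b_j - \eta, \quad j = 1,\dots,m, \tag{5d}\\
& x \in \mathbb{Z}^{n_1} \times \mathbb{R}^{n_2}, \quad w \ge 0, \quad \eta \in \mathbb{R}, \tag{5e}
\end{align}

provided that $c(x)$ and $h(x)$ are linear as well. In view of the above, we refer to constraint (2b) as the
certainty equivalent constraint.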
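As an illustration of the definition above for a finite scenario set, the following minimal Python sketch evaluates a certainty equivalent measure of risk by a one-dimensional search over $\eta$. It is not the authors' implementation: the function name `certainty_equivalent_risk`, the bracket-expansion heuristic, and the use of ternary search are assumptions made for this example, which relies only on the convexity of the objective in $\eta$. With the linear deutility $v(t) = t$ the value coincides with CVaR.

```python
def certainty_equivalent_risk(losses, probs, alpha, v, v_inv, iters=200):
    """Evaluate rho(X) = min_eta eta + (1-alpha)^(-1) * v^{-1}(sum_j p_j v([x_j - eta]_+)).

    Returns (value, eta). Assumes the objective is convex in eta, as required
    of the deutility function v in the text.
    """

    def objective(eta):
        expected = sum(p * v(max(0.0, x - eta)) for x, p in zip(losses, probs))
        return eta + v_inv(expected) / (1.0 - alpha)

    # For eta >= max(losses) the objective equals eta, so the minimizer is <= max(losses).
    lo, hi = min(losses), max(losses)
    # Hypothetical bracket expansion: move the left end until the objective stops decreasing.
    step = max(hi - lo, 1.0)
    while objective(lo - step) < objective(lo):
        lo -= step
        step *= 2.0
    lo -= step

    # Ternary search on a convex one-dimensional function.
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) <= objective(b):
            hi = b
        else:
            lo = a
    eta = 0.5 * (lo + hi)
    return objective(eta), eta


# Example: four equiprobable loss scenarios, linear deutility (the CVaR special case).
value, eta = certainty_equivalent_risk(
    [1.0, 2.0, 3.0, 4.0], [0.25] * 4, 0.5, lambda t: t, lambda s: s
)
```

In an optimization model the search over $\eta$ is not performed separately; $\eta$ becomes a decision variable, as in (5).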


3  Branch-and-Bound  based  on  Outer  Polyhedral  Approximations 

3.1 Existing Methods and Approach due to Vielma et al. (2008)

Branch-and-bound (BnB) methods for solving MINLP problems are often divided into two categories
depending on the way continuous relaxations are handled. The first group consists of the methods which
solve the exact nonlinear continuous relaxation, usually using some version of an interior point method
(see, for example, Gupta and Ravindran, 1985; Borchers and Mitchell, 1994; Leyffer, 2001 and references
therein). Alternatively, polyhedral approximations can be employed to help with finding approximate
solutions of the continuous relaxations (Duran and Grossmann, 1986; Fletcher and Leyffer, 1994; Quesada
and Grossmann, 1992; Bonami et al., 2008; Vielma et al., 2008). This approach has been the basis for
a few MINLP solvers such as Bonmin (Bonami et al., 2008), FilMINT (Abhishek et al., 2010) or AOA
(AIMMS open MINLP solver). For example, outer approximation algorithms (AOA) solve an alternating
sequence of MILP master problems and NLP subproblems, while in LP-NLP-based BnB methods (Quesada
and Grossmann, 1992, FilMINT) the solution of a single master mixed-integer linear programming
(MILP) problem is interrupted every time an integer-valued candidate is found in order to solve an exact NLP,
the solution of which is then used to generate new outer approximations.

Another framework has been proposed by Vielma et al. (2008) for the case of mixed-integer second-order
cone programming (MISOCP) problems. The authors exploit the fact that there exists an extremely
efficient lifted outer polyhedral approximation of second-order cones, and thus propose to solve a full-sized
approximating LP at each node of the master MILP, while, as previously, an exact NLP is solved every
time a new integer solution is found. Note that in this case the algorithm is guaranteed to find a solution
that is $\varepsilon$-feasible to the relaxation at each node of the BnB tree, as opposed to the LP-NLP approach,
where the NLP solution is used to generate new approximating facets. Hence, one of the key differences
between different implementations of such BnB methods can be viewed as a trade-off between the size
of the approximating LPs (i.e., the accuracy of the approximation) and the number of exact NLPs that need
to be solved. Note that an exact NLP, of course, provides tighter lower bounds, and thus more pruning
capabilities, while LPs bring in superior warm-start efficiencies, consequently reducing the processing
time in each node. In this sense, the approach of Vielma et al. (2008) can be viewed as the most
conservative in terms of the use of the exact solvers: NLPs are only solved when absolutely necessary to
verify incumbent integer solutions.

The fact that this approach relies on an efficient lifted approximation scheme is essential, since otherwise
exponentially large polyhedral approximations may be required to achieve guaranteed $\varepsilon$-feasibility
for general nonlinear constraints. The main source of difficulty here can be associated with the high
dimensionality of the constraint, i.e., it can be seen as a manifestation of the “curse of dimensionality”.
In Vinel and Krokhmal (2014c) we have shown that this framework can be competitive even when no
such efficient approximation scheme is available by designing a branch-and-bound method based on polyhedral
approximations for mixed-integer $p$-order cone programming (MIpOCP) problems. The key idea there
was the introduction of a cutting plane generation procedure for approximately solving continuous pOCP
relaxations. In the next subsection we demonstrate that a similar approach is applicable in
the more general setting considered in the current paper. In fact, certainty equivalent constraints can be
naturally viewed as the most general setting which still allows for direct application of the considered
dimensionality reduction techniques.


3.2  Lifted  Approximation  Procedure 


In the context of MISOCP problems, efficient (in the dimensionality and number of facets) polyhedral
approximations of second-order cones due to Ben-Tal and Nemirovski (2001) are available, which are
constructed via a two-step procedure. During the first step, a lifting technique, dubbed by the authors
“tower of variables”, is used to express the high-dimensional second-order cone set via a number of
low-dimensional second-order cones, and then a clever lifting approximation procedure is applied to
the resulting low-dimensional second-order cone sets. In our previous work (Vinel and Krokhmal, 2014c)
dealing with general $p$-order cones, the second step of this procedure was replaced by a simpler gradient-based
approximation, which could be constructed via an efficient cutting plane procedure. In the current
endeavor, we again resort to the first-step lifting procedure due to Ben-Tal and Nemirovski (2001), and
then investigate the problem of constructing polyhedral approximations of the resulting low-dimensional
sets using a cutting plane technique.

Let us denote the set described by constraint (2b) as

\[ \mathcal{V}^{(m+1)} := \Big\{ w \in \mathbb{R}^{m+1}_+ \;\Big|\; w_0 \ge v^{-1}\Big(\sum_{j=1}^{m} p_j\, v(w_j)\Big) \Big\}, \tag{6} \]

where, in order to unclutter the notation, we omit the dependence of $\mathcal{V}^{(m+1)}$ on the parameters $p_j$ and
function $v$. We will call a set of the form (6) a “$\mathcal{V}$-set”. Note also that from here on we assume that $w \in \mathbb{R}^{m+1}_+$
in order to simplify the exposition. Analogous analysis can be conducted when this condition does not
hold.

Proposition 3.1 (Tower-of-variables). Given $p_j > 0$, $j = 1,\dots,m$, and a function $v$ that satisfies
assumptions (i)-(iii), there exist values $\beta_1,\dots,\beta_{2m-2} > 0$ such that the projection of the $2m$-dimensional
set

\[ \mathcal{V}^{(2m)} := \Big\{ w \in \mathbb{R}^{2m}_+ \;\Big|\; w_0 = w_{2m-1},\ \ w_{m+j} \ge v^{-1}\big(\beta_{2j-1}\, v(w_{2j-1}) + \beta_{2j}\, v(w_{2j})\big),\ j = 1,\dots,m-1 \Big\} \tag{7} \]

onto the space of variables $w_0,\dots,w_m$ equals the set $\mathcal{V}^{(m+1)}$. Moreover, $\beta_j$ can be selected in such a
way that $\beta_{2j-1} + \beta_{2j} = 1$ for $j = 1,\dots,m-1$.


Proof. As it has been noted above, the set of inequalities in (7) defines a structure that can be referred to
as a tower-of-variables, where each variable $w_j$ is represented by a node, and edges connect node $w_{m+j}$
with $w_{2j-1}$ and $w_{2j}$. Let us define sets $T_j$ as $T_j = \{j\}$ if $j = 1,\dots,m$ and $T_{m+j} = T_{2j-1} \cup T_{2j}$
for $j = 1,\dots,m-1$. In other words, set $T_j$ is the subset of indices $\{1,\dots,m\}$ corresponding to the
initial (non-lifting) variables descending from $w_j$ in the tower-of-variables. In this case, let us take

\[ \beta_{2j-1} = \frac{\sum_{k \in T_{2j-1}} p_k}{\sum_{k \in T_{2j-1} \cup T_{2j}} p_k}, \qquad \beta_{2j} = \frac{\sum_{k \in T_{2j}} p_k}{\sum_{k \in T_{2j-1} \cup T_{2j}} p_k}, \qquad j = 1,\dots,m-1. \tag{8} \]

Now, the claim of the proposition can be verified directly. □


Remark. Proposition 3.1 represents, perhaps, the most general version of the original “tower-of-variables”
scheme of Ben-Tal and Nemirovski (2001) proposed for second-order cone sets. Note also
that the choice of vector $\beta$ ensuring that the claim above holds is not unique. The particular approach
proposed in (8) guarantees that $\beta_{2j-1} + \beta_{2j} = 1$, ensuring that each of the inequalities in (7) describes
a proper $\mathcal{V}$-set.
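The weights in the tower-of-variables construction can be computed by accumulating probability mass from the leaves upward. The Python sketch below is an illustration only: the helper names `tower_weights` and `tower_value` are hypothetical, indices follow the convention that node $w_{m+j}$ has children $w_{2j-1}$ and $w_{2j}$, and the probabilities are assumed to sum to one. It evaluates the tower with every lifted inequality tight and can be compared against the direct certainty equivalent.

```python
import math


def tower_weights(p):
    """Return beta[1..2m-2] (index 0 unused) for probabilities p[0..m-1]."""
    m = len(p)
    mass = [0.0] * (2 * m)          # mass[j]: total probability of leaves under node w_j
    for j in range(1, m + 1):
        mass[j] = p[j - 1]
    beta = [0.0] * (2 * m - 1)
    for j in range(1, m):           # node w_{m+j} has children w_{2j-1} and w_{2j}
        left, right = 2 * j - 1, 2 * j
        mass[m + j] = mass[left] + mass[right]
        beta[left] = mass[left] / mass[m + j]
        beta[right] = mass[right] / mass[m + j]
    return beta


def tower_value(p, w, v, v_inv):
    """Value of the top node w_{2m-1} when every lifted inequality is tight."""
    m = len(p)
    beta = tower_weights(p)
    vals = [0.0] * (2 * m)
    vals[1:m + 1] = w               # leaves w_1..w_m
    for j in range(1, m):
        left, right = 2 * j - 1, 2 * j
        vals[m + j] = v_inv(beta[left] * v(vals[left]) + beta[right] * v(vals[right]))
    return vals[2 * m - 1]


# Example with a quadratic deutility: the tower reproduces the direct value.
p = [0.1, 0.2, 0.3, 0.4]
w = [1.0, 2.0, 3.0, 4.0]
direct = math.sqrt(sum(pj * wj ** 2 for pj, wj in zip(p, w)))
lifted = tower_value(p, w, lambda t: t * t, math.sqrt)
```

Because each parent index $m+j$ exceeds both child indices, a single forward pass over $j$ is enough to fill in all lifted nodes.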


Proposition 3.1 reduces the problem of constructing a polyhedral approximation for the $(m+1)$-dimensional
$\mathcal{V}$-set (6) to that for $m-1$ three-dimensional (3D) $\mathcal{V}$-sets $\mathcal{V}^{(3)}$ in (7),

\[ \mathcal{V}^{(3)} := \{ w \in \mathbb{R}^3_+ \mid w_0 \ge f(w_1, w_2) \}, \tag{9} \]

where $f(w_1, w_2) := v^{-1}\big(\beta_1 v(w_1) + \beta_2 v(w_2)\big)$. More importantly, it drastically reduces the dimensionality
of the resulting polyhedral approximation; instead of the generally exponential in $m$ number of
hyperplanes needed for approximation of set $\mathcal{V}^{(m+1)}$, only $O(mk)$ hyperplanes are required to approximate
the lifted set $\mathcal{V}^{(2m)}$, provided that each 3-dimensional $\mathcal{V}$-set in (7) can be approximated with $O(k)$
hyperplanes.

In this respect, it is necessary to comment on the precise definition of approximation that we will use in
this work. Namely, we consider the set

\[ \mathcal{V}^{(m+1)}_{\varepsilon} := \Big\{ w \in \mathbb{R}^{m+1}_+ \;\Big|\; (1+\varepsilon)\, v(w_0) \ge \sum_{j=1}^{m} p_j\, v(w_j) \Big\} \]

and, accordingly, its three-dimensional version

\[ \mathcal{V}^{(3)}_{\varepsilon} := \{ w \in \mathbb{R}^3_+ \mid (1+\varepsilon)\, v(w_0) \ge \beta_1 v(w_1) + \beta_2 v(w_2) \}. \tag{10} \]

Observe that such a choice of approximating condition allows us to connect the approximation quality of
a single three-dimensional constraint in the tower-of-variables construction with the multi-dimensional
case.

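Proposition 3.2. Consider set $\mathcal{V}^{(m+1)}$ and its lifted representation $\mathcal{V}^{(2m)}$. If each of the triples in
representation $\mathcal{V}^{(2m)}$ satisfies $(w_{m+j}, w_{2j-1}, w_{2j})^\top \in \mathcal{V}^{(3)}_{\varepsilon}$ for a given $\varepsilon > 0$, then $(w_0,\dots,w_m)^\top \in
\mathcal{V}^{(m+1)}_{\varepsilon'}$, where $\varepsilon' \le (1+\varepsilon)^{\lceil \log_2 m \rceil} - 1 = \lceil \log_2 m \rceil\, \varepsilon + O(\varepsilon^2)$.

Proof. The claim can be verified directly by expanding the tower-of-variables (see also Vinel and
Krokhmal, 2014c, Proposition 3.2). □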
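The accuracy propagation through the tower can be made concrete with a few lines of arithmetic. The Python sketch below assumes that the tower has $\lceil \log_2 m \rceil$ levels and that the per-level errors compound multiplicatively, as in the bound of Proposition 3.2; the function names are hypothetical and introduced only for this illustration.

```python
def tower_depth(m):
    """Number of levels, ceil(log2(m)), computed exactly with integer arithmetic (m >= 2)."""
    return (m - 1).bit_length()


def propagated_eps(eps, m):
    """Upper bound (1 + eps)^depth - 1 on the accuracy of the (m+1)-dimensional set."""
    return (1.0 + eps) ** tower_depth(m) - 1.0


def per_level_eps(target, m):
    """Per-triple accuracy that keeps the propagated bound at the given target."""
    return (1.0 + target) ** (1.0 / tower_depth(m)) - 1.0


# Example: m = 1024 scenarios and per-triple accuracy of one percent.
overall = propagated_eps(0.01, 1024)
needed = per_level_eps(overall, 1024)
```

For small accuracies the bound grows roughly linearly in the number of levels, so the accuracy required of each three-dimensional approximation degrades only logarithmically with the number of scenarios.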

Along with the primary definition $\mathcal{V}^{(m+1)}_{\varepsilon}$ of the $\varepsilon$-approximation of a $\mathcal{V}$-set, we also consider two additional
approximation approaches

\[ \overline{\mathcal{V}}^{(3)}_{\varepsilon} := \{ w \in \mathbb{R}^3_+ \mid v\big((1+\varepsilon) w_0\big) \ge \beta_1 v(w_1) + \beta_2 v(w_2) \}, \tag{11} \]

\[ \overline{\overline{\mathcal{V}}}{}^{(3)}_{\varepsilon} := \{ w \in \mathbb{R}^3_+ \mid v(w_0 + \varepsilon) \ge \beta_1 v(w_1) + \beta_2 v(w_2) \}. \tag{12} \]

The set $\overline{\mathcal{V}}^{(3)}_{\varepsilon}$ is a direct extension of the usual approximation used in the case of conic sets (see, for
example, Ben-Tal and Nemirovski (2001)), while set $\overline{\overline{\mathcal{V}}}{}^{(3)}_{\varepsilon}$ represents an absolute error $\varepsilon$-approximation
of a $\mathcal{V}$-set. It should be emphasized here that only the condition in (10) allows for a natural accuracy propagation
analysis for the tower-of-variables construction as in Proposition 3.2. The other two approximating
conditions will be used in the discussion establishing finiteness of the proposed computational procedure
below.

Since the relaxed feasible set considered in the current work is convex, a cutting plane defined as

\[ w_0 \ge f(w_1^*, w_2^*) + f'_{w_1}(w_1^*, w_2^*)\,(w_1 - w_1^*) + f'_{w_2}(w_1^*, w_2^*)\,(w_2 - w_2^*), \tag{13} \]

which is tangent to the 3-dimensional set $\mathcal{V}^{(3)}$ at point $\big(f(w_1^*, w_2^*), w_1^*, w_2^*\big)$, is globally feasible. Hence,
the following general framework can be applied. We will consider a master problem in the form of (2),
where the nonlinear constraint is substituted with a set of cutting planes (13):

\begin{align*}
\min\quad & c^\top x \\
\text{s.t.}\quad & w_{m+j} \ge f\big(w_1^{k_j}, w_2^{k_j}\big) + f'_{w_1}\big(w_1^{k_j}, w_2^{k_j}\big)\big(w_{2j-1} - w_1^{k_j}\big) + f'_{w_2}\big(w_1^{k_j}, w_2^{k_j}\big)\big(w_{2j} - w_2^{k_j}\big), \\
& \qquad j = 1,\dots,m-1, \quad k_j = 1,\dots,K_j, \\
& \text{(2c)-(2e)},
\end{align*}

where $K_j$ is the number of cutting planes on variables $w_{m+j}$, $w_{2j-1}$, $w_{2j}$ for all $j$, derived around
$\big(w_1^{k_j}, w_2^{k_j}\big)$, $k_j = 1,\dots,K_j$. Then, given a current solution $w^*$ of the master problem, we
can add new constraints around pairs $(w^*_{2j-1}, w^*_{2j})$ for those $j$ for which the selected approximation
condition is violated. Afterwards, the master can be re-solved and the iterative process continues. Next
we show that this procedure terminates after a finite number of iterations with a solution that satisfies the
prescribed approximation accuracy, assuming that the feasible sets considered are bounded. As it turns
out, an additional auxiliary approximation scheme may be required.
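The two ingredients of one iteration of this procedure, the violation test (10) and the tangent cut (13), can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the derivative `dv` of the deutility function is supplied by the caller, and the gradient of $f$ is obtained by implicit differentiation of $v(f) = \beta_1 v(w_1) + \beta_2 v(w_2)$, which requires $v'(f(w_1^*, w_2^*)) \neq 0$.

```python
import math


def f_value(beta1, beta2, v, v_inv, w1, w2):
    """f(w1, w2) = v^{-1}(beta1 * v(w1) + beta2 * v(w2))."""
    return v_inv(beta1 * v(w1) + beta2 * v(w2))


def gradient_cut(beta1, beta2, v, dv, v_inv, w1s, w2s):
    """Coefficients (f0, g1, g2) of the cut w0 >= f0 + g1*(w1 - w1s) + g2*(w2 - w2s).

    Implicit differentiation gives df/dw_i = beta_i * v'(w_i) / v'(f);
    the sketch assumes v'(f0) is nonzero at the linearization point.
    """
    f0 = f_value(beta1, beta2, v, v_inv, w1s, w2s)
    g1 = beta1 * dv(w1s) / dv(f0)
    g2 = beta2 * dv(w2s) / dv(f0)
    return f0, g1, g2


def satisfies_eps_condition(beta1, beta2, v, w0, w1, w2, eps):
    """Approximation condition (10): (1 + eps) * v(w0) >= beta1*v(w1) + beta2*v(w2)."""
    return (1.0 + eps) * v(w0) >= beta1 * v(w1) + beta2 * v(w2)


# Example with v(t) = t^2, linearized at (w1*, w2*) = (3, 4).
v = lambda t: t * t
dv = lambda t: 2.0 * t
f0, g1, g2 = gradient_cut(0.5, 0.5, v, dv, math.sqrt, 3.0, 4.0)
```

In a full implementation a master solution whose triple fails `satisfies_eps_condition` would trigger a call to `gradient_cut` at the corresponding pair, and the resulting linear inequality would be appended to the master LP.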

From here on we will assume that $v(t) = a t^p + o(t^p)$ as $t \to 0$. The following simple lemma will
be useful below. We omit the proof of this result since it can be obtained using standard calculus
techniques.

Lemma 3.3. If function $v$ is finite, strictly increasing and convex on $t \ge 0$ and $v(t) = a t^p + o(t^p)$ as
$t \to 0$, then $v^{-1}(\tau) = a^{-1/p} \tau^{1/p} + o(\tau^{1/p})$.

This claim allows us to establish the asymptotic behavior of function $f$ around zero. Indeed, observe that
$f(w_1, w_2) = v^{-1}\big(\beta_1 v(w_1) + \beta_2 v(w_2)\big) = v^{-1}\big(\beta_1 a w_1^p + \beta_2 a w_2^p + o(w_1^p + w_2^p)\big) = (\beta_1 w_1^p +
\beta_2 w_2^p)^{1/p} + o(\|(w_1, w_2)\|_p)$. Moreover, since function $f$ is convex, any plane tangent to the $p$-order
cone defined by constraint $w_0 \ge (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p}$ is a supporting plane for $\operatorname{epi} f$ by definition.

In view of this we propose the following auxiliary approximation scheme: whenever the current solution
of the master is such that $\|(w_1^*, w_2^*)\|_2 \le \Theta$, then in addition to the regular constraint described above,
add a cutting plane tangent to the $p$-order cone, i.e.,

\[ w_0 \ge \frac{\beta_1^{1/p}\, w_1 \cos^{p-1}\theta^*}{(\cos^{p}\theta^* + \sin^{p}\theta^*)^{1-1/p}} + \frac{\beta_2^{1/p}\, w_2 \sin^{p-1}\theta^*}{(\cos^{p}\theta^* + \sin^{p}\theta^*)^{1-1/p}}, \qquad \theta^* = \arctan\frac{\beta_2^{1/p}\, w_2^*}{\beta_1^{1/p}\, w_1^*}, \]

where $\Theta$ is a preselected parameter. The analysis above implies that this cut does not violate the original
certainty equivalent constraint, and moreover, as will be demonstrated below, this approach guarantees
convergence of the proposed cutting plane procedure.
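A plane tangent to the weighted $p$-order cone $w_0 \ge (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p}$ can be computed in closed form. The following Python sketch is an illustration, not the authors' code: the function name `pcone_cut` is hypothetical, and the coefficients are obtained by differentiating the weighted $p$-norm after the substitution $u_i = \beta_i^{1/p} w_i$ and writing the direction of $(u_1, u_2)$ as an angle. By Hölder's inequality the resulting linear function never exceeds the cone's right-hand side on the nonnegative orthant.

```python
import math


def pcone_cut(beta1, beta2, p, w1s, w2s):
    """Return (c1, c2) such that w0 >= c1*w1 + c2*w2 is tangent to the cone
    w0 >= (beta1*w1^p + beta2*w2^p)^(1/p) along the ray through (w1s, w2s).

    Assumes p >= 1, w1s, w2s >= 0 and (w1s, w2s) != (0, 0).
    """
    u1 = beta1 ** (1.0 / p) * w1s
    u2 = beta2 ** (1.0 / p) * w2s
    theta = math.atan2(u2, u1)      # angle of the rescaled point, in [0, pi/2]
    denom = (math.cos(theta) ** p + math.sin(theta) ** p) ** (1.0 - 1.0 / p)
    c1 = beta1 ** (1.0 / p) * math.cos(theta) ** (p - 1) / denom
    c2 = beta2 ** (1.0 / p) * math.sin(theta) ** (p - 1) / denom
    return c1, c2


def cone_rhs(beta1, beta2, p, w1, w2):
    """Right-hand side (beta1*w1^p + beta2*w2^p)^(1/p) of the cone constraint."""
    return (beta1 * w1 ** p + beta2 * w2 ** p) ** (1.0 / p)


# Example: cubic cone with weights (0.3, 0.7), linearized near the origin.
c1, c2 = pcone_cut(0.3, 0.7, 3.0, 0.02, 0.01)
```

Since the cone is positively homogeneous, the cut passes through the origin and depends only on the direction of the linearization point, not on its magnitude.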


Proposition 3.4. Suppose that for a given solution $w^*$ of the master, cuts in the form of (13) are added
around all triples $(w^*_{m+j}, w^*_{2j-1}, w^*_{2j}) \notin \mathcal{V}^{(3)}_{\varepsilon}$, where $j \in \{1,\dots,m-1\}$, and the auxiliary
approximation scheme described above is applied. Assuming that the feasible region is bounded, this cutting
plane procedure terminates after a finite number of iterations for any given $\varepsilon > 0$.

Note that in this proposition we implicitly exclude cases when the original problem is infeasible but its
$\varepsilon$-approximation in the sense of (10) is feasible for every $\varepsilon > 0$. Conditions that guarantee this are given
below in Proposition 3.9.

Before verifying the statement of Proposition 3.4, we establish a few subsidiary lemmas.

Lemma 3.5. If set $\overline{\overline{\mathcal{V}}}{}^{(3)}_{\varepsilon}$ is used in the cutting plane scheme described above instead of $\mathcal{V}^{(3)}_{\varepsilon}$, then the
process terminates in a finite number of iterations even without the auxiliary scheme.

Proof. The claim follows directly from the fact that a bounded convex set in three dimensions can be
approximately described by a finite number of supporting planes (one can derive this result directly by
considering Taylor's expansion of $f$). □

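Lemma 3.6. If $v(t) = |t|^p$, then for any $\varepsilon > 0$ there exists $\varepsilon' > 0$ such that $\overline{\mathcal{V}}^{(3)}_{\varepsilon'} \subseteq \mathcal{V}^{(3)}_{\varepsilon}$ and vice versa,
i.e., for $p$-order cones the conditions in (10) and (11) are equivalent.

Proof. Clearly, for any $\varepsilon > 0$ there exists $\varepsilon' > 0$ such that $(1+\varepsilon')^p = 1 + \varepsilon$, which directly implies the
claim of the lemma. □

Lemma 3.7. For any $\varepsilon > 0$ and $\delta_\varepsilon > 0$ there exists $\varepsilon' > 0$ such that

\[ \overline{\overline{\mathcal{V}}}{}^{(3)}_{\varepsilon'} \cap \{ (w_0, w_1, w_2)^\top \in \mathbb{R}^3_+ \mid (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \ge \delta_\varepsilon \} \subseteq \mathcal{V}^{(3)}_{\varepsilon}. \]

Proof. Let

\[ \varepsilon' = \min_{(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \ge \delta_\varepsilon} \left[ v^{-1}\big(\beta_1 v(w_1) + \beta_2 v(w_2)\big) - v^{-1}\Big(\frac{\beta_1 v(w_1) + \beta_2 v(w_2)}{1+\varepsilon}\Big) \right]. \]

Observe that the minimum above is attained and is strictly positive, since we assume that the feasible
region is bounded and $v^{-1}$ is strictly increasing. Now, if $w_0 \ge v^{-1}\big(\beta_1 v(w_1) + \beta_2 v(w_2)\big) - \varepsilon'$ and
$(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \ge \delta_\varepsilon$, then $w_0 \ge v^{-1}\Big(\frac{\beta_1 v(w_1) + \beta_2 v(w_2)}{1+\varepsilon}\Big)$, which implies the claim of the lemma. □

Lemma 3.8. Consider the set $\{ w \in \mathbb{R}^3_+ \mid (1+\varepsilon)\, w_0 \ge (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \}$, which is analogous
to $\overline{\mathcal{V}}^{(3)}_{\varepsilon}$ for a $p$-order cone. Then, for any $\varepsilon > 0$ there exist $\varepsilon' > 0$ and $\delta > 0$ such that

\[ \{ w \in \mathbb{R}^3_+ \mid (1+\varepsilon')\, w_0 \ge (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \} \cap \{ (w_0, w_1, w_2)^\top \in \mathbb{R}^3_+ \mid (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \le \delta \} \subseteq \mathcal{V}^{(3)}_{\varepsilon}. \]

Proof. In order to establish this result we need to show that there exist $\varepsilon'$ and $\delta$ such that if
$(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \le \delta$ and $w_0 \ge \frac{(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p}}{1+\varepsilon'}$, then $w_0 \ge v^{-1}\Big(\frac{\beta_1 v(w_1) + \beta_2 v(w_2)}{1+\varepsilon}\Big)$. As we
have observed above, $v^{-1}\big(\beta_1 v(w_1) + \beta_2 v(w_2)\big) = (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} + o(\|(w_1, w_2)\|_p)$. Hence,

\[ v^{-1}\Big(\frac{\beta_1 v(w_1) + \beta_2 v(w_2)}{1+\varepsilon}\Big) = \frac{(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p}}{(1+\varepsilon)^{1/p}} + g(w_1, w_2), \]

where $g(w_1, w_2) = o(\|(w_1, w_2)\|_p)$. Then, there exists $\delta$ such that $(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \le \delta$ implies

\[ |g(w_1, w_2)| \le (\beta_1 w_1^p + \beta_2 w_2^p)^{1/p} \left( \frac{1}{(1+\varepsilon/2)^{1/p}} - \frac{1}{(1+\varepsilon)^{1/p}} \right). \]

Consequently, it follows that $v^{-1}\Big(\frac{\beta_1 v(w_1) + \beta_2 v(w_2)}{1+\varepsilon}\Big) \le \frac{(\beta_1 w_1^p + \beta_2 w_2^p)^{1/p}}{(1+\varepsilon/2)^{1/p}}$ for such $(w_1, w_2)$.
Then the claim of the lemma is satisfied if we take $\varepsilon'$ such that $(1+\varepsilon') = (1+\varepsilon/2)^{1/p}$. □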


Proof of Proposition 3.4. Assume to the contrary that the cutting plane procedure does not terminate after
a finite number of iterations. Then, for at least one triple $(w_{m+j}, w_{2j-1}, w_{2j})$ the approximation
condition is violated infinitely many times and, therefore, infinitely many cutting planes are generated.
Let us denote this triple as $(w_0, w_1, w_2)$. First, suppose that there exists $\Delta > 0$ such that the current solution
$(w_0^{(i)}, w_1^{(i)}, w_2^{(i)})$ of the master at iteration $i$ satisfies $\|(w_1^{(i)}, w_2^{(i)})\| \ge \Delta$ for infinitely many iterations.
Consider $\varepsilon'$ defined in Lemma 3.7 for $\varepsilon$ and $\delta_\varepsilon = \Delta/2$. By Lemma 3.5, after a finite number
of iterations the current solution satisfies $(w_0^{(i)}, w_1^{(i)}, w_2^{(i)})^\top \in \overline{\overline{\mathcal{V}}}{}^{(3)}_{\varepsilon'}$, which by Lemma 3.7 implies that
$(w_0^{(i)}, w_1^{(i)}, w_2^{(i)})^\top \in \mathcal{V}^{(3)}_{\varepsilon}$, contradicting our assumption. Hence, this sequence of solutions converges
to zero.

In Vinel and Krokhmal (2014c), Proposition 6, we have shown that a finite number of cutting planes is
sufficient to achieve any preselected accuracy (11) in the case of $p$-order cone constraints. In the current
context this implies that the solution $(w_0^{(i)}, w_1^{(i)}, w_2^{(i)})^\top$ belongs to the set defined in Lemma 3.8 for any
preselected $\varepsilon'$ after finitely many applications of the auxiliary cuts. Taking into account Lemmas 3.6
and 3.8, this implies that $(w_0^{(i)}, w_1^{(i)}, w_2^{(i)})^\top \in \mathcal{V}^{(3)}_{\varepsilon}$ for all sufficiently small $(w_1^{(i)}, w_2^{(i)})$, i.e., $(w_1, w_2)$
does not converge to zero, completing the proof. □

Observe that the result of Proposition 3.4 essentially provides an exact algorithm for solving problem (2).
Indeed, once a solution with a desired accuracy $\varepsilon$ is found, an improved solution can be constructed by
adding new cutting planes. In Vinel and Krokhmal (2014c) a similar result has been established; namely, it
has been shown that a cutting plane approximation procedure is guaranteed to terminate with an $\varepsilon$-feasible
solution in $O(\varepsilon^{-1})$ iterations for $p$-order cone programming and $O(\varepsilon^{-0.5})$ iterations in the case of second-order
cones. Yet, an upper bound on the number of iterations that can be obtained from the proof presented
here would be excessively high due to the way this proof is constructed. It can be shown that this bound
is no better than $O(\varepsilon^{-1.5})$, where the corresponding “big-O” constant is very large. At the same
time, all our experiments with both conic and non-conic problems suggest that in practice only a small
fraction of all possible facets is generated, i.e., the fact that this bound can be very restrictive may not be
detrimental to real-life computational performance.

To conclude our discussion, we would like to comment on the relation between the feasible sets of
the initial nonlinear model and the presented approximated problem. Let us denote as $\mathrm{Feas}(\mathcal{V})$ the set
defined by constraints (1b) and (1c) and as $\mathrm{Feas}(\mathcal{V}_\varepsilon)$ the approximation of $\mathrm{Feas}(\mathcal{V})$ according to (10).
Next, we establish the conditions that guarantee that these feasible sets are “close” to each other (note
that it is possible to find examples where $\mathrm{Feas}(\mathcal{V})$ is empty, while $\mathrm{Feas}(\mathcal{V}_\varepsilon)$ is not). Following the results
presented in Ben-Tal and Nemirovski (2001), Section 4, for the case of second-order cone approximation,
we can formulate the following result, which we present without a proof since the arguments in Ben-Tal
and Nemirovski (2001) apply here as well.

Proposition 3.9. Assume that the problem under consideration is: (i) strictly feasible, i.e., there exist $\bar{x}$
and $r > 0$ such that

\[ H\bar{x} \le h, \qquad v^{-1}\Big(\sum_{j} p_j\, v(a_j^\top \bar{x} + b_j)\Big) \le a_0^\top \bar{x} + b_0 - r, \tag{14a} \]

and (ii) “semibounded”, i.e., there exists $R \ge 1$ such that

\[ Hx \le h, \quad v^{-1}\Big(\sum_{j} p_j\, v(a_j^\top x + b_j)\Big) \le a_0^\top x + b_0 \ \Longrightarrow\ a_0^\top x + b_0 \le R. \tag{14b} \]

Then for every $\varepsilon > 0$ such that $\gamma(\varepsilon) = R\varepsilon / r < 1$, one has $\gamma(\varepsilon)\bar{x} + (1 - \gamma(\varepsilon))\, \mathrm{Feas}(\mathcal{V}_\varepsilon) \subseteq \mathrm{Feas}(\mathcal{V}) \subseteq
\mathrm{Feas}(\mathcal{V}_\varepsilon)$.

3.3  Branch-and-Bound  Method 

Now that an efficient approximation procedure for solving continuous relaxations has been determined, it can be
incorporated into a branch-and-bound method due to Vielma et al. (2008). Namely, we consider a master
mixed-integer linear programming (MILP) problem (denoted as $P_1$), which is constructed from problem
(2) by substituting (2b) with a set of initial cutting planes of the form (13). The solution procedure
consists of applying a regular branch-and-bound method to $P_1$, with two adjustments. First, lower bounds
obtained from the continuous relaxations of $P_1$ are found by applying the approximation scheme due to
Proposition 3.4 with a preselected value of $\varepsilon = \varepsilon_1$. Note that it is not necessary to remove any of
the added cutting planes before proceeding to the next node of the solution tree, since these constraints
are globally valid. Second, when an integer-valued solution of $P_1$ is found, in order to check its
feasibility with respect to the exact nonlinear formulation and either declare it the incumbent or branch further, the
exact continuous relaxation of $P_1$ must be solved with bounds on the relaxed values of variables $x$
determined by the integer-valued solution in question (see Vielma et al., 2008 for more details and
formal analysis). In order to solve the exact relaxation, we once again employ Proposition 3.4; that is,
we construct a second problem $P_2$, which represents a continuous relaxation of (2). In this case, we
solve it using the same cutting plane procedure due to Proposition 3.4, but with $\varepsilon = \varepsilon_2 \ll \varepsilon_1$ instead. A
sufficiently small value of $\varepsilon_2$ guarantees an essentially exact solution.

Note that it has been previously observed (see Vielma et al., 2008; Vinel and Krokhmal, 2014c) that $\varepsilon_1$
can be selected to be relatively large and still provide promising computational results, which explains the
relation $\varepsilon_2 \ll \varepsilon_1$ above. Note also that in this case the described procedure can be viewed as repeated
re-solving of relatively small-scale LP problems $P_1$, which can benefit from warm-start routines, guided
by a regular branch-and-bound, with occasional calls to a larger-scale $P_2$.


4  Valid  Inequalities 

4.1  Existing  Approaches 

It is well known that valid inequality theory has been essential in the development of efficient
solvers, particularly in mixed-integer linear programming (MILP). Building on this success, various
approaches to generating valid inequalities have been proposed for mixed-integer nonlinear programming
(MINLP) problems. To name a few: Atamtürk and Narayanan (2010, 2011) have proposed mixed-integer
rounding (MIR) and conic lifted cuts for conic programming problems; Stubbs and Mehrotra (1999)
studied cutting plane theory in 0-1 mixed-convex programming; Çezik and Iyengar (2005) proposed
Chvátal-Gomory cuts in conic programming; Bonami (2011) has considered lift-and-project cuts. There
has also been a series of publications addressing possible approaches to designing disjunctive (or split)
cuts in MINLP (for example, Saxena et al. 2008; Burer and Saxena 2012; Cadoux 2010; Kılınç et al.
2010; Modaresi et al. 2015, among others).

In this section we consider two approaches for generation of valid inequalities for the MINLP problem
(2). First, we discuss lifted nonlinear cuts building on the developments in Atamtürk and Narayanan
(2011) and Vinel and Krokhmal (2014a). Afterwards, we will present a simple geometric argument that
allows us to construct a class of linear disjunctive cuts valid for our feasible set.

4.2  Lifted  Nonlinear  Cuts 

A lifting procedure for conic mixed-integer programming has been proposed in Atamtürk and Narayanan
(2011). The authors introduced a lifting scheme, which provides a way of generating new conic valid
inequalities for mixed-integer conic sets. We have employed this approach for solving MIpOCP problems
in Vinel and Krokhmal (2014a) and obtained promising numerical results for a class of risk-averse
portfolio optimization models. While this technique has been proposed as a way to generate conic cuts for
conic feasible sets, we show next that it can be extended to L-sets as well. As will be clear from
our discussion below, our main contribution here lies in the reformulation of the procedure in nonconic
terms, while all of the proofs directly follow from the previous developments in Atamtürk and Narayanan
(2011) and Vinel and Krokhmal (2014a).

In this section we will closely follow the notation introduced in Atamtürk and Narayanan (2011). Once
again, consider the set $V^{(m+1)}$ defined by (6). It is going to play the role of the conic feasible set used in
Atamtürk and Narayanan (2011). We can then define


$T^n(b) := \Big\{ (x^0, \ldots, x^n) \in X^0 \times \cdots \times X^n \;\Big|\; b - \sum_{i=0}^{n} A^i x^i \in V^{(m+1)} \Big\}$  (15)


where each $X^i$ is a mixed-integer set in $\mathbb{R}^{n_i}$ and $A^i$ and $b$ are of appropriate dimensions. It is also
assumed that $0 \in X^i$ for all $i$. Suppose that $u : \mathbb{R} \to \mathbb{R}$ satisfies the same assumptions as function $v$ and
construct set $U^{(m+1)}$ analogously to $V^{(m+1)}$. Let us further assume that inequality $g - F^0 x^0 \in U^{(m+1)}$
is valid for $T^0(b)$. Atamtürk and Narayanan (2011) show how this inequality can be lifted by computing
$F^\ell \in \mathbb{R}^{(m+1) \times n_\ell}$ for $\ell = 1, \ldots, i$ so that the cut $g - \sum_{\ell=0}^{i} F^\ell x^\ell \in U^{(m+1)}$ is valid for $T^i(b)$ when sets $V^{(m+1)}$
and $U^{(m+1)}$ are proper cones. Then, the following theorem can be shown to hold for a lifting set $\Phi^i(\mathbf{v})$
defined as

$\Phi^i(\mathbf{v}) := \Big\{ d \in \mathbb{R}^{m+1} \;\Big|\; g - \sum_{\ell=0}^{i} F^\ell x^\ell - d \in U^{(m+1)} \text{ for all } (x^0, \ldots, x^i) \in T^i(b - \mathbf{v}) \Big\}.$

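Recall also that a parametrized set $\Phi(\mathbf{v})$ is called superadditive on $\mathbb{R}^m$ if $\Phi(\mathbf{u}) + \Phi(\mathbf{v}) \subseteq \Phi(\mathbf{u} + \mathbf{v})$ for
all $\mathbf{u}$ and $\mathbf{v}$, where $\Phi(\mathbf{u}) + \Phi(\mathbf{v})$ denotes the usual Minkowski sum.

Theorem 4.1. 1. $\Phi^i(\mathbf{v})$ is closed and convex.

2. $0 \in \Phi^i(0)$.

3. $\Phi^{i+1}(\mathbf{v}) \subseteq \Phi^i(\mathbf{v})$.

4. $F^1, \ldots, F^{i+1}$ generate a valid inequality for $T^{i+1}(b)$ iff $F^{i+1} x^{i+1} \in \Phi^i(A^{i+1} x^{i+1})$ for all $x^{i+1} \in X^{i+1}$.

5. If $\Omega(\mathbf{v}) \subseteq \Phi^0(\mathbf{v})$ is superadditive, then $F^1, \ldots, F^{i+1}$ generate a valid inequality for $T^{i+1}(b)$
whenever $F^{i+1} x^{i+1} \in \Omega(A^{i+1} x^{i+1})$ for all $x^{i+1} \in X^{i+1}$.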


Proof. Since the arguments establishing the analogous results in Atamtürk and Narayanan (2011) do not
rely on the conic assumption, we do not repeat them here. □

As noted above, we employed an analogous result for the case of $p$-order cones (i.e.,
$V^{(m+1)} = \{(x_0, x) \in \mathbb{R}^{m+1} \mid x_0 \ge \|x\|_p\}$) in Vinel and Krokhmal (2014a). As it turns out, our results regarding
lifted cuts presented there can be carried through without major changes in the case of L-sets. In particular,
we can consider the set $T^n(b)$ as


T’\b)  :=  j  (x,  y,  t)  eZ|xK2+ 

and  then  show  that  the  following  claim  holds. 

Proposition  4.2.  Inequality 


E  ai  xi  ~  b 

L  7  =  1 


+  v(y)  <  v(t)\ 


(1  -  f)(x-  [b\)+  E  <*i*i 

i  =  1 


+  v(y)  <  v(t)
is  valid  for  Tn(b),  where  [a]+  =  max{0,  a},  a,
a  constant  such  that  x,  <  M  for  all  i. 


\al-h+  |AI(1  -/)] 


M 


+ 


/ 


b  —  [b\,  and  M  is 


This result is a very limited application of Theorem 4.1. Indeed, here we are considering the case when
the set $U^{(m+1)}$ is the same as the initial set $V^{(m+1)}$ and, moreover, not only is all the analysis restricted
to three-dimensional nonlinear constraints, but also the second dimension (represented by variable $y$) is
assumed to be continuous (in other words, the integral structure of the second dimension is relaxed). Despite
these simplifications, it was demonstrated in Vinel and Krokhmal (2014a) that such an approach may
yield promising computational results in MIpOCP problems. In Section 5 we will numerically analyze
this procedure in mixed-integer programming with certainty equivalent constraints. In fact, two of these
stipulations can be viewed as natural assumptions for the task of deriving valid inequalities in our case.
Observe that due to the tower-of-variables technique presented in Section 3 the constraints are already
represented in three-dimensional form, and furthermore, it is also highly undesirable from a computational
perspective to consider $U^{(m+1)}$ different from the initial set $V^{(m+1)}$, since this would result in additional
numerical challenges associated with the new type of nonlinearity introduced to the problem.


4.3  Linear  Disjunctive  Cuts 

Throughout this section we will use the following notation: $\bar{x} = (x_0, x) \in \mathbb{R}^{n+1}$. We will also
reformulate sets defined by certainty equivalent constraints as

$\bar{x} \in \mathcal{K}, \quad \mathcal{K} := \{\bar{x} \in \mathbb{R}^{n+1} \mid F(x) \le x_0\}, \quad F(x) := v^{-1}\Big(\sum_{j=1}^{m} v(|a_j^\top x + b_j|)\Big), \quad x \in \mathbb{Z}^n.$  (17)

Note that here we consider $F(x) := v^{-1}\big(\sum_{j=1}^{m} v(|a_j^\top x + b_j|)\big)$ instead of the possible
$F(x) := v^{-1}\big(\sum_{j=1}^{m} v([a_j^\top x + b_j]_+)\big)$, which would be in accordance with the stochastic programming
motivations. Such a choice simplifies some of our development below, and since it results in a relaxed set $\mathcal{K}$,
any valid inequality obtained for $\mathcal{K}$ will be valid for problem (2) as well.

Disjunctive or split cuts have been extensively studied in the literature, especially when applied to MIP
problems (Balas, 1971). This approach is based on a very intuitive idea: consider the disjunction
$x_k \le \pi_0 \,\vee\, x_k \ge \pi_1 = \pi_0 + 1$ with $\pi_0 \in \mathbb{Z}_+$, where $k \in \{1, \ldots, n\}$ is preselected. Due to the integrality condition
there are no feasible solutions outside of this disjunction; hence, system (17) implies that

$\bar{x} \in \operatorname{conv}\big( \{\bar{x} \in \mathcal{K} \mid x_k \le \pi_0\} \cup \{\bar{x} \in \mathcal{K} \mid x_k \ge \pi_1\} \big).$  (18)

Consequently, any inequality describing this convex hull is valid for the feasible region of (17). Moreover,
in the case of mixed-integer linear programming (MILP) all the sets involved (including the convex
hull above) are polyhedral, which substantially simplifies the construction procedures, and hence,
increases the effectiveness of the cuts. There also exists a considerable amount of literature on generalizing
this approach for MINLP problems (Burer and Saxena, 2012; Cadoux, 2010; Kılınç et al., 2010).
Recently, various efforts to design nonlinear disjunctive cuts have also been presented (see Andersen
and Jensen, 2013; Belotti et al., 2012; Bienstock and Michalka, 2014; Modaresi et al., 2015; Burer and
Kılınç-Karzan, 2014; Kılınç-Karzan, 2015; Kılınç-Karzan and Yıldız, 2015). In some cases (see, for
example, Modaresi et al., 2015) it may be possible to describe the convex hull (18) using a single nonlinear
constraint; in particular, such a description is available for second-order conic sets. Note that many of the
works mentioned above look at the problem in settings considerably more general than described here.
Next, we will study applicability of the disjunctive cut framework to sets of form (17).

The first question that we could ask here is whether it might be better to aim at finding a closed-form
nonlinear description of (18) following one of the recent developments mentioned above, or whether a
simpler linear description could be more useful in this case. Note that if such a nonlinear description is to
be found and then used in a numerical procedure to solve problem (2), then it is highly desirable for it to
be expressed in the same form as the nonlinear constraint already present in the problem. For example,
if the computational procedures used are tailored specifically to the constraints already present in the
problem, then addition of a new type of “nonlinearity” that is not compatible with these approaches
may be impractical. The descriptions obtained in the literature for mixed-integer second-order cone
programming express the convex hull of the disjunction in terms of quadratic sets, essentially preserving
the second-order conic nonlinearity in many practical cases, thus justifying the approach.

Consequently, consider (18) with the certainty equivalent set (17). We can conclude that it is desirable that
its description is itself represented in terms of function $F$ defined in (17). At the same time, consider
supporting hyperplanes for (18). It is easy to see that for at least some of such hyperplanes, their
intersection with the convex hull is a straight line segment between $x_k = \pi_0$ and $x_k = \pi_1$. On the other
hand, the boundary of a set defined in terms of function $F$ does not in general contain such segments, since
it is nonconic. Thus, it is reasonable to expect that such a closed-form description of the convex hull (18)
cannot be expressed in terms of function $F$ alone. With this in mind, we propose to concentrate on the
more modest goal of constructing supporting hyperplanes for (18), or in other words, linear disjunctive
cuts.

Next we propose an intuitive idea for a procedure aimed at avoiding difficulties associated with the
general disjunctive cut generation techniques available in the literature by exploiting specific structural
properties of (18). Suppose that we have selected a point $\bar{x}^0 \in \mathcal{K}$ such that $x^0_k = \pi_0$, i.e.,
$\bar{x}^0$ is located on one side of the disjunction. Given such an $\bar{x}^0$, find $\bar{x}^1 \in \mathcal{K}$ such that $x^1_k = \pi_1$
and $\partial_{(k)}F(x^1) \cap \partial_{(k)}F(x^0) \ne \emptyset$, where the subdifferential $\partial_{(k)}$ is taken with respect to variables $x_i$,
$i \ne k$. A linear disjunctive cut is then constructed as a constraint $\sum_i \alpha_i x_i + \beta \le x_0$, where
$(\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_n)^\top \in \partial_{(k)}F(x^0) \cap \partial_{(k)}F(x^1)$, while $\alpha_k$ and $\beta$ are selected in such a way
that $\sum_i \alpha_i x^0_i + \beta = x^0_0$ and $\sum_i \alpha_i x^1_i + \beta = x^1_0$. Geometrically, it means that the constructed hyperplane
passes through both $\bar{x}^0$ and $\bar{x}^1$ and is supporting to both sides of the disjunction at these
points. Clearly, convexity of $\mathcal{K}$ implies that this cut is valid.

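The described procedure can be formulated as follows. Given $\bar{x}^0 \in \mathbb{R}^n \times \mathbb{R}$, $k \in \{1, \ldots, n\}$, $\pi_0, \pi_1 \in \mathbb{Z}$,
and function $v : \mathbb{R} \to \mathbb{R}$, find $\bar{x}^1 \in \mathbb{R}^n \times \mathbb{R}$ and $(\alpha, \beta) \in \mathbb{R}^n \times \mathbb{R}$ such that

$\sum_{i=1}^{n} \alpha_i x^0_i + \beta = x^0_0$  (19a)

$\sum_{i=1}^{n} \alpha_i x^1_i + \beta = x^1_0$  (19b)

$x^0_k = \pi_0$  (19c)

$x^1_k = \pi_1$  (19d)

$F(x^0) = x^0_0$  (19e)

$F(x^1) = x^1_0$  (19f)

$(\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_n)^\top \in \partial_{(k)}F(x^0) \cap \partial_{(k)}F(x^1).$  (19g)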

Let us denote by $V$ the half-space defined by the linear cut $\sum_{i=1}^{n} \alpha_i x_i + \beta \le x_0$, i.e.,
$V := \{\bar{x} \in \mathbb{R}^{n+1} \mid \sum_{i=1}^{n} \alpha_i x_i + \beta \le x_0\}$. By $\partial\mathcal{K}$ and $\partial V$ we will understand the boundaries of these sets.
In Proposition 4.5 below we will establish validity of this approach, but first we consider a few useful lemmas.

Lemma 4.3. The following statements hold:

(i) $\bar{x}^i \in \partial\mathcal{K}$ for $i = 0, 1$;

(ii) $\bar{x}^i \in \partial V$ for $i = 0, 1$;

(iii) if $\bar{x} \notin V$ and $x_k = \pi_0$, then $\bar{x} \notin \mathcal{K}$;

(iv) if $\bar{x} \notin V$ and $x_k = \pi_1$, then $\bar{x} \notin \mathcal{K}$.

Proof. Claims (i) and (ii) follow immediately from (19a)-(19b) and (19e)-(19f). In order to see that
(iii) holds, note that (19) implies that on the space restricted by $x_k = \pi_0$ the set $\partial V$ is a supporting
hyperplane for the set $\partial\mathcal{K}$, which immediately implies (iii). An analogous observation holds for (iv). □

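Lemma 4.4. The following statements hold:

(i) if $\bar{x} \in \partial V$ and $x_k < \pi_0$, then $\bar{x} \notin \operatorname{int} \mathcal{K}$;

(ii) if $\bar{x} \in \partial V$ and $x_k > \pi_1$, then $\bar{x} \notin \operatorname{int} \mathcal{K}$.

Proof. First, consider claim (i). Suppose that the contrary holds, i.e., $\bar{x} \in \operatorname{int} \mathcal{K}$. Then there exists an
$\varepsilon > 0$ such that $y = (x_0 - \varepsilon, x) \in \mathcal{K}$ and $y \notin V$. Now consider the segment connecting points $y$ and
$\bar{x}^1$, i.e., the set $T := \{\lambda y + (1 - \lambda)\bar{x}^1 \mid \lambda \in [0, 1]\}$. Since both $y \in \mathcal{K}$ and $\bar{x}^1 \in \mathcal{K}$, we have $T \subseteq \mathcal{K}$. Since
$\pi_0 < \pi_1$ and $x_k < \pi_0$, there exists $z = (z_0, z_1, \ldots, z_n) \in T$ such that $z_k = \pi_0$. At the same time, $z \notin V$ as
$y \notin V$ while $\bar{x}^1 \in \partial V$. Thus, by Lemma 4.3 (iii) one has that $z \notin \mathcal{K}$, which contradicts $T \subseteq \mathcal{K}$.
Hence, claim (i) holds. Statement (ii) can be proved analogously. □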

Proposition 4.5. If $\bar{x} \in \mathcal{K}$ and $x_k \notin [\pi_0, \pi_1]$, then $\bar{x} \in V$.

Proof. Assume the contrary, i.e., that $\bar{x} \notin V$, which means that $\sum_{i=1}^{n} \alpha_i x_i + \beta > x_0$. Then, there exists
$y = (x_0 + \varepsilon, x) \in \partial V$ (take $\varepsilon = \sum_{i=1}^{n} \alpha_i x_i + \beta - x_0$). Moreover, by definition $y \in \operatorname{int} \mathcal{K}$. If $x_k < \pi_0$,
then this conclusion contradicts Lemma 4.4 (i); otherwise, $x_k > \pi_1$ and the conclusion above contradicts
Lemma 4.4 (ii). □

This result guarantees that the cut generated by (19) is valid for (18). Moreover, it is easy to see that
for any $\tilde{\beta} > \beta$ and $\tilde{\alpha} = \alpha$ the corresponding cut is not valid due to Lemma 4.3 (i). Hence, system
(19) produces a tight cut in the sense that it cannot be improved by an affine transformation.

Observe that $\bar{x}^1 \in \mathbb{R}^n \times \mathbb{R}$ and $(\alpha, \beta) \in \mathbb{R}^n \times \mathbb{R}$ are the unknowns in the system (19). Given a specific
value of $x^1 \in \mathbb{R}^n$ such that $\partial_{(k)}F(x^0) \cap \partial_{(k)}F(x^1) \ne \emptyset$ it is easy to determine the rest. Indeed, $x^1_0$ is
uniquely defined by (19f), vector $(\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_n)^\top$ can be selected according to (19g), and
$\alpha_k$ and $\beta$ are fixed by (19a) and (19b). Thus, the most challenging step in this procedure is the selection
of $x^1$ satisfying $\partial_{(k)}F(x^0) \cap \partial_{(k)}F(x^1) \ne \emptyset$. At the end of this section we will show that such $x^1$
can always be found, but let us first note that this step can be numerically cumbersome since function
$F$ defined in (17) is only piecewise continuously differentiable. Consequently, we propose to employ
another approximation procedure in order to achieve this goal. Namely, we consider substituting
$|t| \approx \sqrt{t^2 + \varepsilon}$, and hence redefining $F(x) := v^{-1}\big(\sum_j v\big(\sqrt{(a_j^\top x + b_j)^2 + \varepsilon}\big)\big)$. Then, $F$ is continuously
differentiable and in order to find $x^1$ we need to solve a system of nonlinear equations with a given $x^0$:

$\frac{\partial F}{\partial x_i}(x^1) = \frac{\partial F}{\partial x_i}(x^0), \quad i = 1, \ldots, k-1, k+1, \ldots, n.$

After this system is solved, the validity of the
found $x^1$ can be verified directly by comparing $\partial_{(k)}F(x^0)$ and $\partial_{(k)}F(x^1)$.
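
To make the construction concrete, the following is a minimal sketch of system (19) for the smallest nontrivial case $n = 2$, $k = 1$. It is an illustration only and not the implementation used in our experiments. The scenario data `A`, `B`, `P`, the smoothing value `EPS`, and all function names are hypothetical; we also assume $v(t) = e^t - 1$ and probability weights $p_j$ summing to one, so that the smoothed $F$ is a log-sum-exp of convex functions and hence convex. With a single free variable $x_2$, the nonlinear system reduces to one scalar equation, which is solved here by bisection, since $\partial F / \partial x_2$ is nondecreasing in $x_2$.

```python
import math

# Hypothetical illustrative data: three scenarios, two decision variables.
A = [(1.0, 0.5), (-0.5, 1.0), (0.3, -0.8)]  # rows a_j
B = [0.1, -0.2, 0.4]                         # offsets b_j
P = [0.3, 0.3, 0.4]                          # scenario probabilities p_j
EPS = 1e-6                                   # smoothing parameter epsilon


def _softmax_terms(x1, x2):
    """Return u_j = a_j^T x + b_j, s_j = sqrt(u_j^2 + EPS) and softmax weights."""
    u = [a1 * x1 + a2 * x2 + b for (a1, a2), b in zip(A, B)]
    s = [math.sqrt(t * t + EPS) for t in u]
    top = max(s)                             # shift for numerical stability
    w = [p * math.exp(t - top) for p, t in zip(P, s)]
    total = sum(w)
    return u, s, top, total, [wi / total for wi in w]


def F(x1, x2):
    """Smoothed F(x) = ln(sum_j p_j exp(s_j)), i.e. v(t) = exp(t) - 1."""
    _, _, top, total, _ = _softmax_terms(x1, x2)
    return top + math.log(total)


def dF_dx2(x1, x2):
    """Partial derivative of the smoothed F with respect to x2."""
    u, s, _, _, q = _softmax_terms(x1, x2)
    return sum(qj * (uj / sj) * a2 for qj, uj, sj, (_, a2) in zip(q, u, s, A))


def disjunctive_cut(pi0, pi1, x2_0):
    """Solve system (19) for n = 2, k = 1; return (alpha1, alpha2, beta, x2_1)."""
    alpha2 = dF_dx2(pi0, x2_0)               # common partial derivative, cf. (19g)
    lo = hi = x2_0
    step = 1.0
    for _ in range(60):                      # expand until the slope is bracketed
        if dF_dx2(pi1, lo) <= alpha2 <= dF_dx2(pi1, hi):
            break
        lo, hi, step = lo - step, hi + step, 2.0 * step
    else:
        raise ValueError("could not bracket a matching point x^1")
    for _ in range(200):                     # bisection on the monotone derivative
        mid = 0.5 * (lo + hi)
        if dF_dx2(pi1, mid) < alpha2:
            lo = mid
        else:
            hi = mid
    x2_1 = 0.5 * (lo + hi)
    f0, f1 = F(pi0, x2_0), F(pi1, x2_1)      # (19e) and (19f)
    alpha1 = (f1 - f0 - alpha2 * (x2_1 - x2_0)) / (pi1 - pi0)  # (19a)-(19b)
    beta = f0 - alpha1 * pi0 - alpha2 * x2_0
    return alpha1, alpha2, beta, x2_1
```

For example, `disjunctive_cut(1, 2, 0.3)` returns coefficients of a cut $\alpha_1 x_1 + \alpha_2 x_2 + \beta \le x_0$ for the split $x_1 \le 1 \vee x_1 \ge 2$. For $n > 2$ the scalar bisection would have to be replaced by a solver for a system of $n - 1$ nonlinear equations.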

In order to establish existence of vector $x^1$, let us introduce some additional notation. Without
loss of generality we will assume that $k = 1$, and let us define $\tilde{a}_j = (a_{2j}, \ldots, a_{nj})^\top$ for all $j$,
$\tilde{A} = (\tilde{a}_1; \ldots; \tilde{a}_m)$ and $\tilde{x} = (x_2, \ldots, x_n)^\top$, i.e., expressions with tildes represent the values restricted to
variables $(x_2, \ldots, x_n)$. Let us also define $F^\ell(\tilde{x}) := v^{-1}\big(\sum_j p_j v(|\tilde{a}_j^\top \tilde{x} + b_j + a_{1j}\pi_\ell|)\big)$ for $\ell = 0, 1$.

Proposition 4.6. Assuming that $\tilde{A}$ is full rank, for any $\tilde{x}^0$ there exists $\tilde{x}^1$ such that
$\partial F^1(\tilde{x}^1) \cap \partial F^0(\tilde{x}^0) \ne \emptyset$.

Proof. Let us consider vector $\alpha \in \partial F^0(\tilde{x}^0)$. By definition $\alpha$ gives us a supporting hyperplane to $\operatorname{epi} F^0$
at point $\tilde{x}^0$. Let us denote this hyperplane as $P^0$. The full rank of $\tilde{A}$ guarantees that this hyperplane is
non-degenerate, i.e., both functions $F^\ell$ substantially depend on all variables $\tilde{x}$. In order to show that the
claim of the proposition holds, we need to establish that there exists a supporting hyperplane to $\operatorname{epi} F^1$
which is parallel to $P^0$.

First, let us assume that there exists a constant $M$ such that $\operatorname{epi}(F^1 + M) \subseteq \operatorname{epi} F^0$. Then, the hyperplane
$P^0$ is valid for $\operatorname{epi}(F^1 + M)$ (meaning that it does not intersect with it). In this case, since $\operatorname{dom} F^1 =
\mathbb{R}^{n-1}$, there exists a vertical translation of $P^0$ which is supporting to $\operatorname{epi}(F^1 + M)$. Clearly, this
immediately implies the claim of the proposition.

Now, we will show that such a constant exists. Let us introduce an auxiliary variable vector $w \in \mathbb{R}^m$
and functions $G^\ell(w) := v^{-1}\big(\sum_j p_j v(|w_j + b_j + a_{1j}\pi_\ell|)\big)$, i.e., $G^\ell(w) = F^\ell(\tilde{x})$ if $w_j = \tilde{a}_j^\top \tilde{x}$ for
$j = 1, \ldots, m$ and $\ell = 0, 1$. Clearly, both $G^\ell$ are proper and convex and $\operatorname{dom} G^\ell = \mathbb{R}^m$. Consider the recession
function $G^\ell 0^+$ of $G^\ell$ (see, e.g., Rockafellar, 1997 for details),

$(G^\ell 0^+)(w) = \lim_{\lambda \downarrow 0} \lambda G^\ell(\lambda^{-1} w) = \lim_{\lambda \downarrow 0} \lambda v^{-1}\Big(\sum_j p_j v(|\lambda^{-1} w_j + b_j + a_{1j}\pi_\ell|)\Big).$

Observe that since $v^{-1}(0) = 0$ and $v^{-1}$ is nondecreasing and concave, then $v^{-1}(mt) \le m v^{-1}(t)$. With this in mind,

$(G^\ell 0^+)(w) \le \lim_{\lambda \downarrow 0} \lambda v^{-1}\Big(m \max_j v(|\lambda^{-1} w_j + b_j + a_{1j}\pi_\ell|)\Big) \le \lim_{\lambda \downarrow 0} m \max_j |w_j + \lambda(b_j + a_{1j}\pi_\ell)| = m \max_j |w_j| < +\infty.$

Hence, $\operatorname{dom}(G^\ell 0^+) = \mathbb{R}^m$, which implies that $G^1$ is Lipschitz continuous on $\mathbb{R}^m$ with Lipschitz constant
$L = \sup\{(G^1 0^+)(w) \mid \|w\| = 1\}$ (see, for example, Auslender and Teboulle, 2003, Proposition 2.5.5).
Further, note that $G^0(w) = G^1(w + \Delta)$, where $\Delta_j = a_{1j}\pi_0 - a_{1j}\pi_1$. Thus, $|G^1(w) - G^0(w)| =
|G^1(w) - G^1(w + \Delta)| \le L\|\Delta\|$. Finally, if we set $w_j = \tilde{a}_j^\top \tilde{x}$, then $|F^1(\tilde{x}) - F^0(\tilde{x})| = |G^1(w) -
G^0(w)| \le M$, if $M = L\|\Delta\| < +\infty$, which completes the proof. □

Finally, it is necessary to discuss practical selection of $k$, $\pi_0$, $\pi_1$ and $\bar{x}^0$. If the cut generation procedure
is implemented in a branch-and-bound setting, it can be assumed that a solution of a relaxed problem
$x^{\mathrm{relax}}$ is known beforehand. Hence, it is natural to select $k \in \{i \in \{1, \ldots, n\} \mid x^{\mathrm{relax}}_i \notin \mathbb{Z}\}$, $\pi_0 = \lfloor x^{\mathrm{relax}}_k \rfloor$ and
$\pi_1 = \pi_0 + 1$. Since the goal of generating a valid inequality is to cut off $x^{\mathrm{relax}}$, it is also natural to
pick $\bar{x}^0$ according to $x^0_i = x^{\mathrm{relax}}_i$ for $i \ne k$, $x^0_k = \pi_0$ and $x^0_0 = F(x^0)$.
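
The selection rule above can be sketched in a few lines. The text only requires $k$ to index some fractional component; choosing the most fractional one, as done below, is our own assumption for illustration, and the function names and the integrality tolerance are hypothetical. Indices are 0-based in the code.

```python
import math


def choose_split(x_relax, tol=1e-6):
    """Pick a fractional coordinate k and the split values (pi0, pi1).

    Assumption: among fractional coordinates, take the one farthest from
    an integer. Returns None if x_relax is integral within tol.
    """
    best_k, best_gap = None, tol
    for k, value in enumerate(x_relax):
        gap = abs(value - round(value))
        if gap > best_gap:
            best_k, best_gap = k, gap
    if best_k is None:
        return None
    pi0 = math.floor(x_relax[best_k])
    return best_k, pi0, pi0 + 1


def base_point(x_relax, k, pi0):
    """Return x^0: a copy of x_relax with component k moved to pi0."""
    x0 = list(x_relax)
    x0[k] = float(pi0)
    return x0
```

The remaining component $x^0_0 = F(x^0)$ is then obtained by evaluating $F$ at the returned point.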

Before concluding this section, it is worth noting that the proposed procedure does not represent a general
way to generate a split closure for the feasible set (17). Rather, it can be seen as a quick and simple
numerical procedure to find a valid inequality that cuts off the current non-integral solution.


5  Numerical  Experiments 

In this section we will report the results of numerical case studies designed to evaluate the performance
of the proposed techniques. As discussed in the introduction, our main interest in the problem
class considered in this paper stems from risk-averse approaches to stochastic programming, and
hence we base our numerical experiments on this application area. Next, we will discuss the particular
formulation used in our study.


5.1  Model  Formulation 


According to the discussion in Section 2, a scenario-based formulation for the risk-minimization problem
$\min\{\rho(X(x, \omega)) \mid x \in \mathcal{X}\}$, where $\rho$ is a certainty equivalent measure of risk, $X(x, \omega_j) = a_j^\top x + b_j$,
and $x \in \mathbb{Z}^{n_1} \times \mathbb{R}^{n_2}$, reduces to

$\min \; \eta + \frac{1}{1 - \alpha}\, t$  (20a)

$\text{s. t.} \;\; t \ge v^{-1}\Big(\sum_{j=1}^{m} p_j v(w_j)\Big)$  (20b)

$w_j \ge a_j^\top x + b_j - \eta, \quad j = 1, \ldots, m$  (20c)

$x \in \mathbb{Z}^{n_1} \times \mathbb{R}^{n_2}, \quad w \ge 0, \quad \eta \in \mathbb{R}.$  (20d)

Here, function $v$ is nondecreasing, convex and such that $v^{-1}\big(\sum_{j=1}^{m} p_j v(w_j)\big)$ is convex in $w$. Some of the
promising choices for $v$ have been discussed in Vinel and Krokhmal (2014b). Particularly, $v(t) = [t]_+$
leads to the definition of Conditional Value-at-Risk (CVaR), a popular risk measure in many stochastic
programming applications (see Rockafellar and Uryasev, 2000, 2002 for more details). It has also been
observed in the literature (see Krokhmal, 2007; Vinel and Krokhmal, 2014b) that $v(t) = [t]_+^p$ for $p > 1$
and $v(t) = e^{[t]_+} - 1$ can lead to some encouraging decision making performance in the presence of the
so-called heavy-tailed distributions of risks.

Computationally, problem (20) with linear $v(t) = [t]_+$ leads to a linear mixed-integer programming
problem, while $v(t) = [t]_+^p$ results in a MIpOCP problem, both of which have been studied before as
previously discussed. Since the solution approaches proposed in the current paper are targeted towards
the more general non-conic cases, in our numerical experiments here we concentrate on the case of
$v(t) = e^{[t]_+} - 1$, which has been referred to in Vinel and Krokhmal (2014b) as the Log-Exponential Convex
Risk (LogExpCR) measure.
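
For a fixed decision $x$, formulation (20) reduces to a one-dimensional minimization over $\eta$, which gives a direct way to evaluate the risk measure on a set of scenario losses. The sketch below is illustrative only. The function names are hypothetical, and the ternary search assumes that the objective is convex in $\eta$ and attains its minimum between the smallest and largest loss, which holds for the two choices of $v$ shown but is not claimed for arbitrary $v$.

```python
import math


def ce_risk(losses, probs, alpha, v, v_inv, iters=200):
    """Evaluate min_eta { eta + v_inv(sum_j p_j v([X_j - eta]_+)) / (1 - alpha) }.

    Ternary search over eta in [min X, max X]; assumes the objective is
    convex in eta and minimized inside that interval.
    """
    def objective(eta):
        total = sum(p * v(max(x - eta, 0.0)) for x, p in zip(losses, probs))
        return eta + v_inv(total) / (1.0 - alpha)

    lo, hi = min(losses), max(losses)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


def cvar(losses, probs, alpha):
    """v(t) = [t]_+ (identity on nonnegative arguments) recovers CVaR."""
    return ce_risk(losses, probs, alpha, lambda t: t, lambda y: y)


def log_exp_cr(losses, probs, alpha):
    """v(t) = exp([t]_+) - 1 gives the LogExpCR measure."""
    return ce_risk(losses, probs, alpha, math.expm1, math.log1p)
```

With ten equiprobable losses 1, ..., 10 and $\alpha = 0.8$, `cvar` returns the average of the two largest losses, 9.5, and by Jensen's inequality `log_exp_cr` is never smaller than `cvar` on the same data.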

We utilize a financial portfolio optimization model as the basic decision making problem in our study. It is
often used as a test model in the risk-averse stochastic programming literature and, additionally, enjoys
an abundance of real-life historical data that can be used in various case studies.

In a standard risk-reward portfolio selection problem, a set of $n$ financial assets is considered. Then, the
loss is defined as the negative portfolio return, $X(x, \omega) = -r(\omega)^\top x$, where $x$ stands for the vector of
portfolio weights, and $r = r(\omega)$ is the uncertain vector of assets' returns. Consequently, the goal is to
select portfolio weights $x$ in such a way that the risk associated with this choice, as evaluated by a risk
measure $\rho$, is minimized, while maintaining a certain predefined value of the expected return (reward):

$\min_{x \in \mathbb{R}^n_+} \big\{ \rho(-r^\top x) \;\big|\; \mathsf{E}(r^\top x) \ge \bar{r}, \; \mathbf{1}^\top x \le 1 \big\},$  (21)

where $\bar{r}$ is the prescribed level of the expected return, $x \in \mathbb{R}^n_+$ denotes the no-short-selling requirement,
and $\mathbf{1} = (1, \ldots, 1)^\top$.

We consider two types of investment constraints that lead to mixed-integer portfolio optimization problems.
Many floor trading systems mandate that assets can only be bought in “lots” of shares (for instance,
in multiples of 1,000 shares), which leads to a lot-buying constrained portfolio optimization model:

$\min_{x \in \mathbb{R}^n_+,\; z \in \mathbb{Z}^n_+} \Big\{ \rho(-r^\top x) \;\Big|\; \mathsf{E}(r^\top x) \ge \bar{r}, \; \mathbf{1}^\top x \le 1, \; x = \frac{L}{C}\operatorname{Diag}(m)\, z \Big\},$  (22)

where $L$ is the size of the lot, $C$ is the investment capital (in dollars), and vector $m \in \mathbb{R}^n_+$ represents
the prices of assets. Similarly, it may be desirable for a portfolio to contain no more than a prescribed
number of assets, which leads to a cardinality-constrained portfolio optimization model:

$\min_{x \in \mathbb{R}^n_+,\; z \in \{0,1\}^n} \big\{ \rho(-r^\top x) \;\big|\; \mathsf{E}(r^\top x) \ge \bar{r}, \; \mathbf{1}^\top x \le 1, \; x \le z, \; \mathbf{1}^\top z \le Q \big\},$  (23)

where $Q$ is the maximum number of assets in the portfolio.
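
The two mixed-integer side constraints are simple to state in code. The sketch below is illustrative and uses hypothetical function names. It reads the lot-buying link as $x_i = L\, m_i z_i / C$, i.e., the fraction of capital $C$ spent on $z_i$ lots of $L$ shares at price $m_i$.

```python
def lots_to_weights(z, prices, lot_size, capital):
    """Portfolio weights x_i = L * m_i * z_i / C for integer lot counts z_i."""
    return [lot_size * m_i * z_i / capital for m_i, z_i in zip(prices, z)]


def within_budget(x, tol=1e-12):
    """Budget constraint 1^T x <= 1 with no short selling."""
    return all(w >= -tol for w in x) and sum(x) <= 1.0 + tol


def within_cardinality(x, q, tol=1e-12):
    """Cardinality constraint: at most q assets carry positive weight."""
    return sum(1 for w in x if w > tol) <= q
```

For instance, with $L = 1000$, $C = 100{,}000$ and prices of 10 and 20 dollars, buying 2 lots and 1 lot respectively puts a weight of 0.2 on each asset.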

If historical data (scenarios) for the assets' returns is known, then problem (21) with a LogExpCR risk
measure can be formulated in the form of (20):

$\min \; \eta + (1 - \alpha)^{-1} t$

$\text{s. t.} \;\; t \ge \ln\Big(\sum_{j=1}^{m} p_j e^{w_j}\Big),$

$w + (r_1, \ldots, r_m)^\top x + \mathbf{1}\eta \ge 0,$  (24)

$x^\top\Big(\sum_j p_j r_j\Big) \ge \bar{r},$

$\mathbf{1}^\top x \le 1, \quad x \ge 0, \quad w \ge 0.$

Problems (22) and (23) can be easily reformulated accordingly.

We used historical data for $n$ assets chosen at random from the stocks traded on NYSE, such that historical
prices are available for 5100 consecutive trading periods preceding December 2012. Returns over
$m$ consecutive 10-day periods starting at a (common) randomized date were used to construct the set
of $m$ equiprobable scenarios for the stochastic vector $r$. The values of parameters were set as follows:
$L = 1000$, $C = 100{,}000$, $Q = 5$, $\alpha = 0.9$, $\bar{r} = 0.005$. Historical values of the assets' returns have been
scaled by a multiplicative parameter MUL, i.e., $r_{ij} = \mathrm{MUL} \cdot \frac{m_{i,j} - m_{i,j-10}}{m_{i,j-10}}$, where $m_{ij}$ is the close price of
asset $i$ at day $j$. Note that since the LogExpCR measure is not positively homogeneous, the value of MUL
has an impact on the decisions preferred (see Vinel and Krokhmal, 2014b for more details).
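
A minimal sketch of the scenario construction for a single asset is given below. It is illustrative only: the function name and arguments are hypothetical, and we assume that the $m$ periods are non-overlapping and that each return is the relative price change over the period, scaled by MUL.

```python
def scenario_returns(prices, m, start, mul=1.0, horizon=10):
    """Scaled returns of one asset over m consecutive horizon-day periods.

    prices : daily close prices of the asset
    start  : index of the first day of the first period
    """
    returns = []
    for j in range(m):
        begin = prices[start + j * horizon]
        end = prices[start + (j + 1) * horizon]
        returns.append(mul * (end - begin) / begin)
    return returns
```

Each scenario then receives probability $1/m$. Because LogExpCR is not positively homogeneous, running the same data with different values of `mul` generally leads to different optimal portfolios, not just rescaled objective values.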


5.2  Preliminary Study: Polyhedral Approximation Method for Convex Portfolio Optimization


As  a  preliminary  computational  study,  we  performed  experiments  with  the  convex  formulation  (21).  Our 
goal  here  was  to  test  the  performance  of  the  proposed  cutting  plane  approximation  procedure  compared 
to  the  existing  exact  approaches.  Namely,  we  implemented  the  iterative  algorithm  presented  in  Section 
3.2  using  CPLEX  LP  solver  for  iteratively  resolving  the  master  problem  and  compared  it  against  MOSEK 
NLP  interior  point  solver. 


In addition, we also implemented a simpler version of the iterative approximation approach, which is
applicable for the case of exponential constraints. This scheme follows the same iterative master cutting
plane approach with the only difference being the structure of the tangent planes utilized. Observe that
$t \ge \ln\big(\sum_{j=1}^{m} p_j e^{w_j}\big)$ can be equivalently expressed as

$\sum_j p_j \xi_j \le 1, \quad e^{w_j - t} \le \xi_j, \quad \xi_j \ge 0, \quad j = 1, \ldots, m.$

Consequently, the cutting planes (which are actually lines) in this case can be constructed as tangent
to the nonlinear constraint of the form $e^x \le y$. In the remainder of this section we will refer to this
approach as the simple approximation procedure. Our aim here is to verify whether the lifting procedure
presented in Section 3.2 is superior to this more straightforward approach.
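
The tangent cuts used by the simple approximation procedure can be sketched as follows. This is an illustration rather than the code used in the experiments, and all names are hypothetical. A tangent to $y = e^x$ at $\hat{x}$ gives the linear inequality $y \ge e^{\hat{x}}(1 + x - \hat{x})$, which is valid for every point satisfying $e^x \le y$ because the exponential is convex. A new cut is added only when the current polyhedral approximation underestimates $e^x$ by more than a tolerance.

```python
import math


def tangent_cut(x_hat):
    """Return (slope, intercept) of the tangent to y = exp(x) at x_hat."""
    slope = math.exp(x_hat)
    return slope, slope * (1.0 - x_hat)


def outer_approximation(x, cuts):
    """Smallest y allowed by the current cuts at the point x."""
    return max(slope * x + intercept for slope, intercept in cuts)


def refine(cuts, x, tol):
    """Add a tangent at x if the cuts underestimate exp(x) by more than tol."""
    if math.exp(x) - outer_approximation(x, cuts) > tol:
        cuts.append(tangent_cut(x))
        return True
    return False
```

In a master-problem loop, `refine` would be called at the current LP solution for each scenario constraint $e^{w_j - t} \le \xi_j$, and the LP would be re-solved until no new cut is added.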

Results of this study are summarized in Table 1. For each combination of number of assets $n$, number
of scenarios $m$ and the value of scaling parameter MUL we generated 20 random instances based on
the historical data as described above. The columns “MOSEK”, “Simple” and “CP” correspond to the
average solution time used by the MOSEK NLP solver, the iterative procedure based on simple approximation,
and the lifted cutting planes procedure presented in Section 3.2, respectively. Column “MOSEK-CP” reports
the maximum absolute difference in portfolio values obtained with MOSEK and the lifted cutting planes
procedure, while column “MOSEK-SA” contains the maximum absolute difference between MOSEK and
the simple approximation. The approximation accuracy as well as CPLEX and MOSEK feasibility and
optimality tolerances were set to $10^{-6}$.

Observe that for all instances the lifted approach outperforms the simple approximation both in terms of
solution time and accuracy, which confirms the theoretical advantages of the lifting procedure discussed
in Section 3.2. The lifted cutting plane approach returns portfolios that are mostly within the prescribed
tolerance from the exact solutions due to MOSEK. Note that since the values reported are absolute
differences, they naturally increase with the increase in the value of parameter MUL. Moreover, the
approximation procedure finds these solutions significantly faster than MOSEK for the instances with
MUL = 10000 and large-scale instances with MUL = 1 (recall that a change in the scaling parameter
in fact changes the optimal solution, i.e., this scaling is more than a simple computational convenience).
Additionally, MOSEK could not find an optimal solution, returning an infinite portfolio value, for some
instances with MUL = 10000, which suggests that the polyhedral approximation-based procedure may be more
stable numerically.

The goal of this preliminary study was to check whether the essentially exact algorithm based on the
approximation procedure can be competitive against state-of-the-art NLP solvers, and whether the
introduction of lifting leads to computational improvement. We clearly observed that, at least for the class
of instances considered here, the cutting-plane method performs favorably compared to the MOSEK NLP
solver, outperforming it significantly for some instances. At the same time, the cutting plane procedure
based on the simple approximation scheme does not possess any favorable computational properties.

These results, obtained on convex problems, are also of significant importance in the context of the
branch-and-bound process for the corresponding mixed-integer programming models. It has been demonstrated
in the literature that in the case of second-order cone programming a branch-and-bound method based
on a polyhedral approximation procedure can still outperform conventional approaches even while the
approximation scheme itself may not result in computational improvement for the convex model (Glineur,
2000; Vielma et al., 2008). Our previous experience with solving p-order cone programming problems
suggested a similar conclusion. In view of this, the results of this preliminary study allow us to confirm
that the proposed approach to the mixed-integer model is promising. In the next subsection we will study
this case directly.

5.3  Discrete  Portfolio  Optimization 

CPLEX MIP and LP solvers have been used to implement the branch-and-bound method described in
Section 3. Namely, callback routines have been employed in order to add approximating hyperplanes
at each node of the solution tree, while a goal framework was utilized to guide branching. The exact
algorithm based on the approximation scheme presented in Section 3 has been used to verify incumbent
solutions. The two families of valid inequalities have been employed by means of CPLEX callback
routines. In our experiments we only added cuts at the root node of the branch-and-bound tree. A
quasi-Newton method has been used to solve the underlying nonlinear systems of equations when finding
the linear split cuts presented in Section 4.3.

Two  sets  of  experiments  have  been  conducted  to  estimate  the  effects  of  the  techniques  proposed  in  the 

               ------------------- MUL = 1 -------------------   ----------------- MUL = 10000 -----------------
   n      m    MOSEK    Simple      CP    MOSEK-CP    MOSEK-SA    MOSEK    Simple      CP    MOSEK-CP    MOSEK-SA
  20    100     0.11      0.13    0.07    3.00E-06    1.64E-04     0.11      0.09    0.05    0.00E+00    6.00E-04
        200     0.07      0.32    0.08    2.00E-07    1.14E-05     0.21      0.25    0.08    1.00E-04    1.50E-03
        500     0.14      1.72    0.20    1.00E-06    3.30E-05     0.50      0.92    0.17    0.00E+00    1.20E-03
       1000     0.28      7.03    0.55    1.00E-06    3.10E-05     1.69      2.89    0.30    0.00E+00    7.00E-03
       2000     0.66     29.26    1.57    2.00E-06    3.40E-05     2.44     11.01    0.58    0.00E+00    6.00E-03
       5000     5.83         …    5.54    1.00E-06    1.71E-03     8.74         …    1.41    0.00E+00    4.00E-02
  50      …    (entries for the smaller m are only partly legible: 0.09, 0.00E+00, 0.24, 0.00E+00, ***, 14.99)
       5000    21.41    700.39   59.05    1.00E-06    5.17E-04   104.00     74.34    3.90         ***         ***
 500    100     0.16      0.14    0.05    1.00E-06    6.32E-04     0.42      0.13    0.07    0.00E+00    1.00E-03
        200     0.75      0.58    0.12    3.00E-06    8.40E-05     0.52      0.35    0.13    4.00E-03    4.00E-03
        500     3.86      5.78    0.48    3.00E-07    1.40E-06     3.23      1.96    0.51    1.00E-04    1.20E-03
       1000    21.02     42.64    2.00    4.00E-07   -2.00E-07    14.62      7.31    1.68    1.00E-04    5.40E-03
       2000    87.45    308.63    4.59    2.00E-07    1.30E-06   332.82     27.64    7.31         ***         ***
       5000   143.58   1312.82   27.77    1.00E-06    2.75E-04   564.27     76.05    9.01         ***         ***

Table 1: Performance of the solvers for convex portfolio optimization problems. Columns MOSEK, Simple and
CP represent average solution time (in seconds) over 20 instances of MOSEK NLP solver, cutting planes method
based on simple approximation scheme and cutting planes method with a lifted scheme. The maximum absolute
difference in the portfolio value is reported in columns MOSEK-SA (comparing solution due to MOSEK with the
one due to simple approximation) and MOSEK-CP (MOSEK and lifted cutting planes method); n is the number of
assets, m is the number of scenarios, MUL is the scaling parameter. “***” corresponds to the instances for which
MOSEK returned an infinite portfolio value.

      n           5                       10                        20
      m       10      50     100      10      50     100      10      50      100
 CG-BNB     0.93    0.87    0.34    0.80    1.46    1.62    1.51    2.70     3.99
  AIMMS    49.17   67.10   73.55  104.43  151.35  221.19  195.29  618.61  7710.85

Table 2: Running time of AIMMS-AOA and the proposed implementation of the branch-and-bound method in
lot-buying constrained portfolio optimization. Results averaged over 20 instances.


      n           10                      20                        50
      m      500    1000    2000     500    1000    2000     500     1000     2000
 CG-BNB     0.74    1.72    5.10   12.03   22.57   50.64  108.67   240.38   263.57
  AIMMS    11.65   35.90   96.88  294.74  459.21  639.43  863.50  1489.65  2071.98

Table 3: Running time of AIMMS-AOA and the proposed implementation of the branch-and-bound method in
cardinality constrained portfolio optimization. Results averaged over 20 instances.


paper. First, the implementation of the branch-and-bound method from Section 3 has been compared
against the AIMMS AOA implementation. The results for lot-buying and cardinality constrained problems
are summarized in Tables 2 and 3, respectively. Observe that our custom implementation significantly
outperforms the AOA method for all choices of the parameters n and m. It is worth noting that, as stated
in the AIMMS manual, their implementation is much more efficient for binary variables, which is the case
in our cardinality constrained problems. This observation explains the fact that in our experiments the
improvement over the AOA method has been less significant for this class of problems. Overall, we can
conclude that these experiments confirm that the branch-and-bound approach presented here can be seen
as a viable strategy for solving the considered class of MINLP problems.

In the second stage of our case study, we aimed at evaluating the effect that the valid inequalities defined
in Section 4 can have in solving problems (22) and (23). Results of this case study are summarized in
Tables 4 and 5. Note that for each problem size 20 instances were generated and solved with a 1 hour
time limit. We report the number of instances solved within the time limit, the solution time and number of
nodes in the branch-and-bound tree averaged over the instances that have been solved in 1 hour by all
three approaches, and the average integrality gap among instances not solved to optimality.

We can observe that in both of the models the usage of the proposed valid inequalities leads to improved
solution performance, especially for larger problem sizes. It is, in our view, particularly important to
note that we are able to solve more problem instances within the time limit, as well as significantly
reduce the integrality gap. It can also be noted that while in the case of lot-buying constrained problems
the lifted cuts presented in Section 4.2 exhibit the best overall performance, in cardinality constrained
optimization this approach does not provide any improvement over pure branch-and-bound.


6  Conclusions 

In  this  paper  we  discussed  solution  approaches  for  a  class  of  mixed-integer  nonlinear  programming 
problems,  which  arise  from  some  recent  developments  in  risk-averse  stochastic  optimization.  In  our 
study,  we  revisit  some  of  the  methods  that  have  been  previously  proposed  in  the  literature,  and  show  that 

                 number solved             running time               nodes in solution tree            gap after time limit
    n     m   lifted  split  no cuts    lifted    split   no cuts      lifted      split     no cuts     nonlin     split   no cuts
   50   500     20     20      20        11.57     9.92     11.01     5864.50    4309.05     5905.65       -         -        -
       1000     20     20      20        41.07    38.45     28.57     9307.70    8265.75     6453.65       -         -        -
       2000     20     20      20        68.12    68.11    138.37     7411.30    6559.15    13016.30       -         -        -
       5000     19     19      19       695.14   622.18    581.49    18903.58   16145.32    15368.53     2.41%     5.19%    6.25%
  100   500     19     14      14       400.22   436.02    467.32   129745.46  173480.42   190997.69       -         -        -
       1000     15     13      13       456.84   502.90   1300.26    77967.64   86555.38   221685.91     2.68%    14.02%    6.06%
       2000     19     20      15       179.06   337.18    223.93    11908.73   24955.93    16974.87     3.01%       -      5.46%
       5000     19     20      18       673.90   670.20    731.66    16101.59   13831.22    17026.82       -         -        -
  200   500      6      1       0          -        -         -          -          -           -       87.92%    46.49%  191.83%
       1000      0      0       0          -        -         -          -          -           -       16.31%    24.34%   22.99%
       2000      ?      6       5       498.57   787.33   2153.11    25654.00   35918.50   138485.00     8.33%     3.84%    6.50%
       5000     17     12      12      1408.58  1804.24   1539.48    19271.44   20442.11    22581.56       -         -        -
  500   500      0      0       0          -        -         -          -          -           -      128.91%   128.89%  200.04%
       1000      0      0       0          -        -         -          -          -           -      109.42%   114.03%  116.02%
       2000      0      0       0          -        -         -          -          -           -       29.27%    29.34%   28.95%
       5000      2      1       0          -        -         -          -          -           -      124.45%   113.67%  213.15%
 1000   500      0      0       0          -        -         -          -          -           -       97.01%    98.57%  106.20%
       1000      0      0       0          -        -         -          -          -           -      227.93%   227.73%  316.26%
       2000      0      0       0          -        -         -          -          -           -       54.65%    55.90%   65.86%
       5000      0      0       0          -        -         -          -          -           -      111.06%   214.31%  219.85%

Table 4: Performance of two valid inequality families in lot-buying constrained portfolio optimization. The rows
refer to: no cuts - pure branch-and-bound presented in Section 3, lifted - lifted cuts from Section 4.2, split -
disjunctive cuts introduced in Section 4.3. Results averaged over 20 instances. Running time and nodes in solution
tree columns reflect only instances solved within 1 hour time limit by all three approaches. Similarly gap after
time limit corresponds to instances for which no optimal solution was found within the time limit for each of the
methods.


                 number solved             running time               nodes in solution tree            gap after time limit
    n     m   lifted  split  no cuts    lifted    split   no cuts      lifted      split     no cuts     nonlin     split   no cuts
   50   500     20     20      20       108.84   122.91    108.67    25574.20   26636.55    25574.20       -         -        -
       1000     20     20      20       240.62   252.45    240.38    19634.00   19239.50    19634.00       -         -        -
       2000     20     20      20       263.00   288.33    263.57     7651.90    7506.10     7651.90       -         -        -
       5000     20     20      20       152.99    76.31    151.91     1274.30     994.70     1274.30       -         -        -
  100   500      6      7       6      2001.51  1795.24   1998.73   293837.33  111602.00   293837.33    23.63%    20.40%   23.62%
       1000      0      3       0          -        -         -          -          -           -       29.48%    28.22%   29.36%
       2000      3      5       3      2770.52  2440.59   2796.48    54008.00   42317.75    54008.00    13.08%    11.84%   13.07%
       5000     18     19      18      1043.44   991.63   1047.82     7770.28    6734.63     7770.28     4.63%     4.60%    4.61%
  200   500      0      1       0          -        -         -          -          -           -       85.93%    74.56%   85.78%
       1000      0      0       0          -        -         -          -          -           -       71.87%    52.10%   71.86%
       2000      0      0       0          -        -         -          -          -           -       37.56%    17.68%   37.56%
       5000      0      0       0          -        -         -          -          -           -        8.87%     8.82%    8.87%
  500   500      0      1       0          -        -         -          -          -           -      178.71%    79.19%  178.56%
       1000      0      0       0          -        -         -          -          -           -      126.57%    26.28%  126.58%
       2000      0      0       0          -        -         -          -          -           -       67.29%    37.03%   67.30%
       5000      0      0       0          -        -         -          -          -           -       21.63%    13.15%   21.63%
 1000   500      0      0       0          -        -         -          -          -           -      223.31%   123.95%  223.31%
       1000      0      0       0          -        -         -          -          -           -      163.56%    65.14%  163.57%
       2000      0      0       0          -        -         -          -          -           -       92.95%    73.52%   92.96%
       5000      0      0       0          -        -         -          -          -           -      219.35%   124.00%  219.36%

Table 5: Performance of two valid inequality families in cardinality constrained portfolio optimization. The rows
refer to: no cuts - pure branch-and-bound presented in Section 3, lifted - lifted cuts from Section 4.2, split -
disjunctive cuts introduced in Section 4.3. Results averaged over 20 instances. Running time and nodes in solution
tree columns reflect only instances solved within 1 hour time limit by all three approaches. Similarly gap after
time limit corresponds to instances for which no optimal solution was found within the time limit for each of the
methods.

these approaches can be naturally generalized to the MINLP problems in question. In addition, we also
propose a new simple procedure for generating disjunctive cuts. The conducted numerical experiments
produce promising results.

Abstract

We study a framework for constructing coherent and convex measures of risk that is inspired by the
infimal convolution operator, and which is shown to constitute a new general representation of these
classes of risk functions. We then discuss how this scheme may be effectively applied to obtain a class
of  certainty  equivalent  measures  of  risk  that  can  directly  incorporate  preferences  of  a  rational  decision 
maker  as  expressed  by  a  utility  function.  This  approach  is  consequently  employed  to  introduce  a  new 
family  of  measures,  the  log-exponential  convex  measures  of  risk.  Conducted  numerical  experiments 
show  that  this  family  can  be  a  useful  tool  for  modeling  of  risk-averse  preferences  in  decision  making 
problems  with  heavy-tailed  distributions  of  uncertain  parameters. 

Keywords: Coherent risk measures, convex risk measures, stochastic optimization, risk-averse
preferences, utility theory, certainty equivalent, stochastic dominance, log-exponential convex measures
of risk


1  Introduction 

Informally, a decision making problem under uncertainties can be stated as the problem of selecting a
decision $x \in C \subset \mathbb{R}^n$, given that the cost $X$ of this decision depends not only on $x$, but also on a random
event $\omega \in \Omega$: $X = X(x, \omega)$. A principal modeling challenge that one faces in this setting is to select
an appropriate ordering of random outcomes $X$, or, in other words, define a way to choose one uncertain
outcome, $X_1 = X(x_1, \omega)$, over another, $X_2 = X(x_2, \omega)$. A fundamental contribution in this context
is represented by the expected utility theory of von Neumann and Morgenstern (1944), which argues
that if the preferences of a decision maker are rational, i.e., they satisfy a specific system of properties
(axioms), then there exists a utility function $u : \mathbb{R} \mapsto \mathbb{R}$, such that a decision under uncertainty is optimal
if it maximizes the expected utility of the payoff. Equivalently, the random elements representing payoffs
under uncertainty can be ordered based on the corresponding values of expected utility of these payoffs.
Closely connected to the expected utility theory is the subject of stochastic orderings (see Levy, 1998),
and particularly stochastic dominance relations, which have found applications in economics, decision
theory, game theory, and so on.

An alternative approach to introducing preference relations over random outcomes $X(x, \omega)$, which has
traditionally been employed in optimization and operations research literature, and which is followed
in the present work, is to introduce a function $\rho : \mathcal{X} \mapsto \mathbb{R}$, where $\mathcal{X}$ is an appropriately defined space
containing $X$, such that $X_1$ is preferred to $X_2$ whenever $\rho(X_1) < \rho(X_2)$. The decision making problem
in the presence of uncertainties can then be expressed as a mathematical program

$$\min\{\rho(X) : X = X(x, \omega) \in \mathcal{X},\ x \in C\}, \qquad (1)$$

where function $\rho$ is usually referred to as a risk measure. In stochastic programming literature, the
objective of a minimization problem like (1) has traditionally been chosen in the form of the expected cost,
$\rho(X) = \mathsf{E}X$ (Prékopa, 1995; Birge and Louveaux, 1997), which is commonly regarded as a
representation of risk-neutral preferences. In the finance domain, the pioneering work of Markowitz (1952) has
introduced a risk-reward paradigm for decision making under uncertainty, and variance was proposed as
a measure of risk, $\rho(X) = \sigma^2(X)$. Since then, the problem of devising risk criteria suitable for
quantification of specific risk-averse preferences has received significant attention (see a survey in Krokhmal
et al., 2011). It was noticed, however, that “ad-hoc” construction of $\rho$ may yield risk functionals that,
while serving well in a specific application, are flawed in a general methodological sense. Artzner et al.
(1999) suggested an axiomatic approach, similar to that of von Neumann and Morgenstern (1944), to
defining a well-behaved risk measure $\rho$ in (1), and introduced the concept of coherent measures of risk.
Subsequently, a range of variations and extensions of the axiomatic framework for designing risk
functionals have been proposed in the literature, such as convex and spectral measures of risk (Föllmer and
Schied, 2004; Acerbi, 2002), deviation measures (Rockafellar et al., 2006), and so on; see an overview
in Krokhmal et al. (2011) and Rockafellar and Uryasev (2013). Since many classes of axiomatically
defined risk measures represent risk preferences that are not fully compatible with the rational risk-averse
preferences of utility theory, of additional interest are risk measures that possess such a compatibility in
a certain sense.

In  this  paper  we  propose  a  new  representation  for  the  classes  of  coherent  and  convex  measures  of  risk, 
which  builds  upon  a  previous  work  of  Krokhmal  (2007).  This  representation  is  then  used  to  introduce 
a  class  of  coherent  or  convex  measures  of  risk  that  can  directly  incorporate  rational  risk  preferences  as 
prescribed  by  the  corresponding  utility  function,  through  the  concept  of  certainty  equivalent.  This  class 
of  certainty  equivalent  measures  of  risk  contains  some  of  the  existing  risk  measures,  such  as  the  popular 
Conditional Value-at-Risk (Rockafellar and Uryasev, 2000, 2002), as special cases. As an application
of  the  general  approach,  we  introduce  a  two-parameter  family  of  log-exponential  convex  risk  measures, 
which  quantify  risk  by  emphasizing  extreme  losses  in  the  tail  of  the  loss  distribution.  Two  case  studies 
illustrate  the  practical  merits  of  the  log-exponential  risk  measures;  in  particular,  it  is  shown  that  these 
nonlinear  measures  of  risk  can  be  preferable  to  more  traditional  measures,  such  as  Conditional  Value-at- 
Risk,  if  the  loss  distribution  is  heavy-tailed  and  contains  catastrophic  losses. 

The  rest  of  the  paper  is  organized  as  follows.  In  Section  2.1  we  briefly  discuss  the  classes  of  coherent  and 
convex  measures  of  risk  as  well  as  some  of  their  properties.  Section  2.2  establishes  that  the  constructive 
formula  of  Krokhmal  (2007)  does  actually  constitute  a  representation  for  coherent  risk  measures  and 
can  be  generalized  to  the  case  of  convex  measures  of  risk.  Using  this  representation,  in  Section  2.3  we 
introduce  a  class  of  coherent  or  convex  measures  of  risk  that  are  based  on  certainty  equivalents  of  some 
utility  functions.  In  Section  2.4  we  further  study  some  of  the  properties  of  this  class  of  risk  measures. 
Finally,  Section  3  discusses  the  log-exponential  convex  measures  of  risk,  and  illustrates  their  properties 
with  two  case  studies. 

2  Risk  Measures  Based  on  Infimal  Convolution


2.1  Coherent  and  Convex  Measures  of  Risk 

Consider a random outcome $X \in \mathcal{X}$ defined on an appropriate probability space $(\Omega, \mathcal{F}, \mathsf{P})$, where $\mathcal{X}$ is
a linear space of $\mathcal{F}$-measurable functions $X : \Omega \mapsto \mathbb{R}$. A function $\rho : \mathcal{X} \mapsto \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is said to
be a convex measure of risk if it satisfies the following axioms:

(A0) lower semicontinuity (l.s.c.);

(A1) monotonicity: $\rho(X) \le \rho(Y)$ for all $X \le Y$;

(A2) convexity: $\rho(\lambda X + (1 - \lambda)Y) \le \lambda\rho(X) + (1 - \lambda)\rho(Y)$, $\lambda \in [0, 1]$;

(A3) translation invariance: $\rho(X + a) = \rho(X) + a$, $a \in \mathbb{R}$.

Similarly, function $\rho : \mathcal{X} \mapsto \overline{\mathbb{R}}$ is said to be a coherent measure of risk if it satisfies (A0)-(A3), and,
additionally,

(A4) positive homogeneity: $\rho(\lambda X) = \lambda\rho(X)$, $\lambda > 0$.
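
The axioms (A1)-(A4) can be probed numerically on finite, equiprobable scenario sets. The sketch below is only an illustration: the helper `check_axioms`, the sampling scheme and the two functionals are assumptions introduced here, and a randomized check can expose violations but cannot prove that an axiom holds. The worst-case loss passes all four checks, while the hypothetical ad hoc functional "mean plus two standard deviations" admits a simple counterexample to monotonicity.

```python
import random
import statistics

def worst_case(x):
    """Maximum loss over the scenarios."""
    return max(x)

def mean_plus_two_std(x):
    """Ad hoc functional used only as a counterexample (not from the text)."""
    return statistics.fmean(x) + 2.0 * statistics.pstdev(x)

def check_axioms(rho, trials=200, m=50, seed=0, tol=1e-9):
    """Randomized test of (A1)-(A4) on equiprobable scenario vectors."""
    rng = random.Random(seed)
    ok = {"A1": True, "A2": True, "A3": True, "A4": True}
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(m)]
        y = [xi + rng.random() for xi in x]          # y >= x in every scenario
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]
        lam = rng.random()
        a = rng.uniform(-3.0, 3.0)
        scale = rng.uniform(0.1, 5.0)
        if rho(x) > rho(y) + tol:                    # (A1) monotonicity
            ok["A1"] = False
        mix = [lam * xi + (1.0 - lam) * zi for xi, zi in zip(x, z)]
        if rho(mix) > lam * rho(x) + (1.0 - lam) * rho(z) + tol:   # (A2) convexity
            ok["A2"] = False
        if abs(rho([xi + a for xi in x]) - rho(x) - a) > 1e-7:     # (A3) translation
            ok["A3"] = False
        if abs(rho([scale * xi for xi in x]) - scale * rho(x)) > 1e-7:  # (A4)
            ok["A4"] = False
    return ok

# Counterexample to (A1): X = (-4, 0) <= Y = (0, 0) in every scenario,
# yet the ad hoc functional assigns the larger value to X.
x_small, y_large = [-4.0, 0.0], [0.0, 0.0]
violates_monotonicity = mean_plus_two_std(x_small) > mean_plus_two_std(y_large)
```
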

Remark 2.1. We assume that the space $\mathcal{X}$ is endowed with necessary properties so that the corresponding
risk measures are well defined. Specifically, $\mathcal{X}$ is a space of integrable functions, $\mathsf{E}|X| < +\infty$, and is
equipped with an appropriate topology, which is assumed to be the topology induced by convergence
in probability, unless stated otherwise. Also, it is assumed throughout the paper that all considered
functions are proper (recall that function $f : \mathcal{X} \mapsto \overline{\mathbb{R}}$ is proper if $f(X) > -\infty$ for all $X \in \mathcal{X}$, and
$\operatorname{dom} f = \{X \in \mathcal{X} \mid f(X) < +\infty\} \ne \emptyset$).

Remark 2.2. In this work we adopt the traditional viewpoint of engineering literature that a random
quantity $X$ represents a cost or a loss, in the sense that smaller realizations of $X$ are preferred. In
economics literature it is customary to consider $X$ as wealth or payoff variable, whose larger realizations are
desirable. In most cases, these two approaches can be reconciled by inverting the sign of $X$, which may
require some modifications to the properties discussed above. For example, the translation invariance
axiom (A3) will have the form $\rho(X + a) = \rho(X) - a$ in the case when $X$ is a payoff function.

Remark 2.3. Without loss of generality, we also assume that a convex measure of risk satisfies the
normalization property: $\rho(0) = 0$ (observe that coherent measures necessarily satisfy this property). First,
such a normalization requirement is natural from methodological and practical viewpoints, since there is
usually no risk associated with zero costs or losses. Second, due to translation invariance any convex $\rho$
can be normalized by setting $\tilde{\rho}(X) = \rho(X) - \rho(0)$.

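Remark 2.4. It is worth noting that normalized convex measures of risk satisfy the so-called
subhomogeneity property:

(A4') subhomogeneity: $\rho(\lambda X) \le \lambda\rho(X)$ for $\lambda \in (0, 1)$ and $\rho(\lambda X) \ge \lambda\rho(X)$ for $\lambda > 1$.

Indeed, in order to see that the first inequality in (A4') holds, observe that $\lambda\rho(X) = \lambda\rho(X) + (1 -
\lambda)\rho(0) \ge \rho(\lambda X + (1 - \lambda)0) = \rho(\lambda X)$ for $\lambda \in (0, 1)$. Similarly, if $\lambda > 1$, then $\frac{1}{\lambda}\rho(\lambda X) = \frac{1}{\lambda}\rho(\lambda X) +
\left(1 - \frac{1}{\lambda}\right)\rho(0) \ge \rho(X)$.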
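
Subhomogeneity can be observed on a concrete convex, normalized measure that is not positively homogeneous. The sketch below uses the entropic functional $\rho(X) = \ln \mathsf{E}e^{X}$ on equiprobable scenarios; this choice of measure, the sample, and the helper names are assumptions made here for illustration and are not taken from the text.

```python
import math
import random

def entropic(x):
    """ln E[exp(X)] on equiprobable scenarios, shifted by max(x) for stability."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x) / len(x))

def subhomogeneity_holds(rho, x, lam, tol=1e-9):
    """Check rho(lam*X) <= lam*rho(X) for lam in (0,1), and >= for lam > 1."""
    scaled = rho([lam * xi for xi in x])
    if lam < 1.0:
        return scaled <= lam * rho(x) + tol
    return scaled >= lam * rho(x) - tol

rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
results = {lam: subhomogeneity_holds(entropic, sample, lam)
           for lam in (0.25, 0.5, 0.9, 1.5, 3.0)}
```

Every entry of `results` is expected to be `True`; equality in (A4') would hold for all $\lambda$ only for a positively homogeneous (coherent) measure.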

Artzner et al. (1999) and Delbaen (2002) have proposed a general representation for the class of
coherent measures by showing that a mapping $\rho : \mathcal{X} \mapsto \overline{\mathbb{R}}$ is a coherent risk measure if and only if
$\rho(X) = \sup_{Q \in \mathcal{Q}} \mathsf{E}_Q X$, where $\mathcal{Q}$ is a closed convex subset of $\mathsf{P}$-absolutely continuous probability
measures. Föllmer and Schied (2002) have generalized this result to convex measures of risk. Since
then, other representations have been proposed, see Kusuoka (2001, 2012); Frittelli and Rosazza Gianin
(2005); Dana (2005); Acerbi (2002). For example, Acerbi (2002) has suggested a spectral
representation: $\rho(X) = \int_0^1 \mathrm{VaR}_\lambda(X)\,\psi(\lambda)\,d\lambda$, where $\psi \in \mathcal{L}^1([0, 1])$. While many of these results led to important
theoretical insights and methodological conclusions, relatively few of them provided practical ways for
construction of new risk measures in accordance with specified risk preferences, which are also
conducive to implementation in mathematical programming problems. Below we discuss a representation
that may be better suited for this purpose.

2.2  An  Infimal  Convolution  Representation  for  Coherent  and  Convex  Measures  of  Risk 

An approach to constructing coherent measures of risk that was based on the operation of infimal
convolution was proposed in Krokhmal (2007). Given a function $\phi : \mathcal{X} \mapsto \overline{\mathbb{R}}$, consider a risk measure $\rho$,
which we will call a convolution-based measure of risk, in the form

$$\rho(X) = \inf_{\eta}\ \eta + \phi(X - \eta). \qquad (2)$$

Then, the following claim has been shown to hold.

Proposition 2.1 (Krokhmal, 2007, Theorem 1). Suppose that function $\phi$ satisfies axioms (A0)-(A2) and
(A4), and, additionally, is such that $\phi(\eta) > \eta$ for all constant $\eta \ne 0$. Then the infimal convolution in
(2) is a proper coherent measure of risk. Moreover, the infimum in (2) is attained for all $X$, and can be
replaced with a minimization operator.
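
A familiar instance of (2) is obtained with $\phi(X) = (1-\alpha)^{-1}\mathsf{E}[X]_+$, $\alpha \in (0,1)$, for which the convolution gives the Conditional Value-at-Risk formula of Rockafellar and Uryasev. The sketch below evaluates it on equiprobable scenarios and compares the result with the average of the worst $(1-\alpha)$ share of losses; the function names and the scenario data are assumptions made here, and the tail-average comparison presumes that $(1-\alpha)$ times the number of scenarios is an integer.

```python
def phi(x, alpha):
    """phi(X) = E[X]_+ / (1 - alpha) on equiprobable scenarios."""
    return sum(max(xi, 0.0) for xi in x) / ((1.0 - alpha) * len(x))

def convolution_measure(x, alpha):
    """Evaluate min over eta of eta + phi(X - eta); returns (value, eta*)."""
    def objective(eta):
        return eta + phi([xi - eta for xi in x], alpha)
    # The objective is piecewise linear and convex in eta with kinks at the
    # scenario values, so it suffices to search over the scenarios.
    eta_star = min(x, key=objective)
    return objective(eta_star), eta_star

def tail_average(x, alpha):
    """Mean of the k largest losses, k = (1 - alpha) * len(x), assumed integer."""
    k = round((1.0 - alpha) * len(x))
    return sum(sorted(x)[-k:]) / k

# Illustrative data: a permutation of the losses 0, 1, ..., 19.
losses = [float((7 * i) % 20) for i in range(20)]
value, eta_star = convolution_measure(losses, alpha=0.9)
```

With these data the two worst losses are 19 and 18, so both `value` and `tail_average(losses, 0.9)` should be approximately 18.5. Note that this $\phi$ meets the condition of Proposition 2.1, since $[\eta]_+/(1-\alpha) > \eta$ for every constant $\eta \ne 0$.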

In  this  section  we  show  that  this  approach  can  be  substantially  generalized,  which  leads  us  to  formulate 
Theorem  2.5  below.  Before  moving  to  this  general  result,  we  establish  a  few  subsidiary  lemmas.  First,  we 
demonstrate  that  expression  (2)  is  a  representation,  i.e.,  any  coherent  measure  of  risk  can  be  expressed 
in  the  form  of  (2). 

Lemma 2.2. Let $\rho$ be a coherent measure of risk. Then, there exists a proper function $\phi : \mathcal{X} \mapsto \overline{\mathbb{R}}$
that satisfies axioms (A0)-(A2) and (A4), $\phi(\eta) > \eta$ for all constant $\eta \ne 0$, and is such that $\rho(X) =
\min_{\eta}\ \eta + \phi(X - \eta)$.

Proof. For a given proper and coherent $\rho$ consider $\phi_\rho(X) = 2[\rho(X)]_+$, where $[X]_+ = \max\{X, 0\}$, and
observe that $\phi_\rho$ is proper and satisfies (A0)-(A2) and (A4) if $\rho$ is coherent, and, moreover, $\phi_\rho(\eta) =
2[\eta]_+ > \eta$ for all real $\eta \ne 0$. Finally, $\min_\eta\ \eta + \phi_\rho(X - \eta) = \min_\eta\ \eta + 2[\rho(X - \eta)]_+ = \min_\eta\ \eta +
2[\rho(X) - \eta]_+ = \rho(X)$, i.e., any coherent $\rho$ can be represented in the form of (2). □
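
The final equality in the proof can be checked by a short case analysis. The derivation below is a sketch added for convenience; the shorthand $r = \rho(X)$ and the auxiliary function $g$ are notation introduced here, and $\rho(X)$ is assumed finite.

```latex
\[
g(\eta) \;=\; \eta + 2\,[\,r - \eta\,]_+ \;=\;
\begin{cases}
2r - \eta, & \eta \le r,\\[2pt]
\eta,      & \eta > r,
\end{cases}
\qquad r = \rho(X).
\]
% g is strictly decreasing on (-\infty, r] and strictly increasing on [r, +\infty),
% so its minimum is attained at \eta = r:
\[
\min_{\eta}\; \eta + 2\,[\,\rho(X) - \eta\,]_+ \;=\; g(r) \;=\; r \;=\; \rho(X).
\]
```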

Remark 2.5. It is easy to see from the proof of Lemma 2.2 that the function $\phi$ in representation (2) is not
determined uniquely for any given coherent measure $\rho$. Indeed, one can choose (among possibly others)
$\phi(X) = \alpha[\rho(X)]_+$ for any $\alpha > 1$.

Next, we show that the infimal convolution representation (2) can be generalized to convex measures
of risk. Technically, the proof of Proposition 2.1 in Krokhmal (2007) relies heavily on the positive
homogeneity property (A4) of coherent risk measures, but as we demonstrate below, it can be amended
in order to circumvent this issue. Recall that, given a proper, l.s.c., convex function $f$ on $\mathbb{R}^n$ and
$x \in \operatorname{dom} f$, its recession function $(f0^+)(y)$ can be defined as

$$(f0^+)(y) = \lim_{\tau \to \infty} \frac{f(x + \tau y) - f(x)}{\tau}.$$


Note that the above expression does not depend on $x \in \operatorname{dom} f$, hence $(f0^+)(y)$ is well-defined
(Rockafellar, 1997, Theorem 8.5). The result established below mirrors that of Proposition 2.1 in the case of
convex measures of risk.
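
To make the recession function concrete, the sketch below approximates the limit with a large but finite $\tau$ for the scalar convex function $f(t) = \sqrt{1 + t^2}$, whose recession function is $(f0^+)(y) = |y|$. The example function, the value of $\tau$ and the helper name are assumptions made here; a finite $\tau$ only approximates the limit.

```python
import math

def recession_estimate(f, x, y, tau=1e8):
    """Finite-tau approximation of (f0+)(y) = lim (f(x + tau*y) - f(x)) / tau."""
    return (f(x + tau * y) - f(x)) / tau

def f(t):
    """A proper, continuous, convex function on the real line."""
    return math.sqrt(1.0 + t * t)

# The estimate should not depend on the base point x (up to the finite-tau error).
estimate_from_zero = recession_estimate(f, 0.0, 2.0)     # close to |2.0|
estimate_from_three = recession_estimate(f, 3.0, -1.5)   # close to |-1.5|
```

Since $(f0^+)(y) = |y| > 0$ for every $y \ne 0$, this $f$ has no directions of recession and therefore attains its infimum (at $t = 0$), which is the mechanism used in the proof that follows.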


Lemma 2.3. Suppose that a proper function $\phi$ satisfies axioms (A0)-(A2), and, additionally, is such that
$\phi(\eta) > \eta$ for all constant $\eta \ne 0$ and $\phi(0) = 0$. Then the infimal convolution $\rho(X) = \inf_\eta\ \eta + \phi(X - \eta)$
is a proper convex risk measure. Moreover, the infimum is attained for all $X$, and can be replaced with
$\min_\eta$.


Proof. For any fixed $X \in \mathcal{X}$ consider function $\phi_X(\eta) = \eta + \phi(X - \eta)$. Clearly, since $\phi$ is proper, l.s.c.
and convex, $\phi_X$ is l.s.c., convex in $\eta$ and $\phi_X > -\infty$ for all $\eta$. Next we will show that the infimum in
the definition of $\rho$ is attained for any $X$. First, suppose that $\operatorname{dom} \phi_X = \emptyset$, hence $\rho(X) = +\infty$, and the
infimum in the definition is attained for any $\eta \in \mathbb{R}$. Now, assume that there exists $\tilde{\eta} \in \operatorname{dom} \phi_X$, and
consequently both $\phi(X - \tilde{\eta}) < +\infty$ and $\rho(X) < +\infty$. Recall that a proper, l.s.c. function $\phi_X$ attains its
infimum if it has no directions of recession (see Rockafellar, 1997, Theorem 27.2), or in other words, if
$(\phi_X 0^+)(\xi) > 0$ for all $\xi \ne 0$. Observe that

$$(\phi_X 0^+)(\xi) = \lim_{\tau \to \infty} \frac{\tilde{\eta} + \tau\xi + \phi(X - \tilde{\eta} - \tau\xi) - \tilde{\eta} - \phi(X - \tilde{\eta})}{\tau}
= \xi + \lim_{\tau \to \infty} \frac{\phi(X - \tilde{\eta} - \tau\xi)}{\tau} \ge \xi + \lim_{\tau \to \infty} \phi\left(\frac{X - \tilde{\eta}}{\tau} - \xi\right),$$

where the last inequality follows from Remark 2.4 for sufficiently large $\tau$. Since $\phi$ is l.s.c. and $\phi(\xi) > \xi$
for all $\xi \ne 0$, we can conclude that $\lim_{\tau \to \infty} \phi\left(\frac{X - \tilde{\eta}}{\tau} - \xi\right) \ge \phi(-\xi) > -\xi$, whereby $(\phi_X 0^+)(\xi) > 0$ for
all $\xi \ne 0$, which guarantees that the infimum in the definition is attained, and $\rho(X) = \min_\eta\ \eta + \phi(X - \eta)$.

Next, we verify that axiom (A0) holds. As shown above, for any $X \in \mathcal{X}$ there exists $\eta_X$ such that
$\rho(X) = \eta_X + \phi(X - \eta_X)$. Consequently,

$$\liminf_{Y \to X} \rho(Y) = \liminf_{Y \to X} \big(\eta_Y + \phi(Y - \eta_Y)\big) \ge \liminf_{Y \to X} \big(\eta_X + \phi(Y - \eta_X)\big)
= \eta_X + \liminf_{Y \to X} \phi(Y - \eta_X) \ge \eta_X + \phi(X - \eta_X) = \rho(X),$$

where the last inequality holds due to lower semicontinuity of $\phi$. Whence, by definition, $\rho$ is l.s.c.
Verification of properties (A1)-(A3) is straightforward and can be taken from Krokhmal (2007), Theorem 1. □


Lemma 2.4. Let $\rho$ be a convex measure of risk such that $\rho(0) = 0$. Then there exists a proper function
$\phi : \mathcal{X} \mapsto \mathbb{R}$ that satisfies axioms of monotonicity and convexity, is lower semicontinuous, $\phi(\eta) > \eta$ for
all $\eta \neq 0$, and such that $\rho(X) = \min_\eta\, \eta + \phi(X - \eta)$.

Proof. Analogously to Lemma 2.2, one can take $\phi_\rho(X) = 2[\rho(X)]_+$. □
Combining the above results, we obtain a general conclusion.

Theorem 2.5. A proper, l.s.c. function $\rho : \mathcal{X} \mapsto \mathbb{R}$ is a convex (respectively, coherent) measure of risk
if and only if there exists a proper, l.s.c. function $\phi : \mathcal{X} \mapsto \mathbb{R}$, which satisfies the axioms of monotonicity
and convexity (and, respectively, positive homogeneity), $\phi(\eta) > \eta$ for all $\eta \neq 0$, $\phi(0) = 0$, and such that
$\rho(X) = \min_\eta\, \eta + \phi(X - \eta)$.

The importance of infimal convolution representation (2) for convex/coherent risk measures lies in the
fact that it is amenable for use in stochastic programming problems with risk constraints or risk objectives
(note that the problem does not necessarily have to be convex).

Lemma 2.6. Let $\rho$ be a coherent measure of risk, and for some $F, H : \mathcal{X} \mapsto \mathbb{R}$ and $C(\mathcal{X}) \subseteq \mathcal{X}$ consider
the following risk-constrained stochastic programming problem:

$$\min\{F(X) : \rho(X) \le H(X),\ X \in C(\mathcal{X})\}. \qquad (3)$$

Then, for a given convolution representation (2) of $\rho$, problem (3) is equivalent to a problem of the form

$$\min\{F(X) : \eta + \phi(X - \eta) \le H(X),\ X \in C(\mathcal{X}),\ \eta \in \mathbb{R}\}, \qquad (4)$$

in the sense that if (3) is feasible, they achieve minima at the same values of the decision variable
$X$ and their optimal objective values coincide. Moreover, if the risk constraint is binding at optimality
in (3), then $(X^*, \eta^*)$ delivers a minimum to (4) if and only if $X^*$ is an optimal solution of (3) and
$\eta^* \in \arg\min_\eta \{\eta + \phi(X^* - \eta)\}$.

Proof. Analogous to that in Krokhmal (2007), Theorem 3. □

Additionally,  representation  (2)  conveys  the  idea  that  a  risk  measure  represents  an  optimal  value  or 
optimal  solution  of  a  stochastic  programming  problem  of  special  form. 

2.3  Convolution  Representation  and  Certainty  Equivalents 

The infimal convolution representation (2) allows for construction of convex or coherent measures of risk
that directly employ risk preferences of a decision maker through a connection to the expected utility
theory of von Neumann and Morgenstern (1944). Assuming without loss of generality that the loss/cost
elements $X \in \mathcal{X}$ are such that $-X$ represents wealth or reward, consider a non-decreasing, convex
deutility function $v : \mathbb{R} \mapsto \mathbb{R}$ that quantifies dissatisfaction of a risk-averse rational decision maker
with a loss or cost. Obviously, this is equivalent to having a non-decreasing concave utility function
$u(t) = -v(-t)$. By the inverse of $v$ we will understand function $v^{-1}(a) = \sup\{t \in \mathbb{R} : v(t) = a\}$.

Remark 2.6. Note that if a non-decreasing, convex $v(t) \not\equiv \mathrm{const}$ then, according to the definition above,
the inverse is finite, and moreover, if there exists $t$ such that $v(t) = a < +\infty$, then $v^{-1}(a) = \max\{t \in
\mathbb{R} \mid v(t) = a\}$. Additionally, let $v^{-1}(+\infty) = +\infty$.

Then, for any given $\alpha \in (0, 1)$, consider function $\phi$ in the form

$$\phi(X) = \frac{1}{1 - \alpha}\, v^{-1} E v(X), \qquad (5)$$

where we use an operator-like notation for $v^{-1}$, i.e., $v^{-1} E v(X) = v^{-1}(E v(X))$.

Expression $\mathrm{CE}(X) = v^{-1} E v(X)$ represents the certainty equivalent of an uncertain loss $X$, a deterministic
loss/cost such that a rational decision maker would be indifferent between accepting $\mathrm{CE}(X)$ or
an uncertain $X$; it is also known as quasi-arithmetic mean, Kolmogorov mean, or Kolmogorov-Nagumo
mean (see, among others, Bullen et al., 1988; Hardy et al., 1952). Certainty equivalents play an important
role in the decision making literature (see, for example, Wilson, 1979; McCord and Neufville, 1986);
in the context of modern risk theory, certainty equivalents were considered in the work of Ben-Tal and
Teboulle (2007).

In order for function $\phi$ as defined by (5) to comply with the conditions of Theorem 2.5, the deutility
function should be such that $\phi(\eta) = \frac{1}{1-\alpha} v^{-1} v(\eta) > \eta$ for $\eta \neq 0$. This necessarily implies that
$v(\eta) = v(0)$ for all $\eta \le 0$, provided that $v$ is proper, non-decreasing and convex. Indeed, if $v(\eta^*) < v(0)$
for some $\eta^* < 0$, then according to the above remarks $v^{-1} v(\eta^*) = \max\{\eta : v(\eta) = v(\eta^*)\} = \eta^{**}$,
where $\eta^{**}$ is such that $\eta^* \le \eta^{**} < 0$ and $v^{-1} v(\eta^{**}) = \eta^{**}$, whence $\phi(\eta^{**}) = (1 - \alpha)^{-1} \eta^{**} < \eta^{**}$.

Additionally, without loss of generality it can be postulated that $v(0) = 0$, i.e., zero loss means zero
dissatisfaction. Indeed, $\tilde v^{-1} E \tilde v(X) = v^{-1} E v(X)$ for $\tilde v(t) = v(t) - v(0)$, i.e., such a transformation of
the deutility function does not change the value of the certainty equivalent. Similarly, it is assumed that
$v(t) > 0$ for all $t > 0$. This condition represents a practical consideration that positive losses entail
positive deutility/dissatisfaction, and is not restrictive from a methodological point of view. Indeed, it can
be shown that if one allows for $t_0 = \max\{t : v(t) = 0\} > 0$, then the risk measures based on $\phi$ as
given in (5) with deutilities $v(t)$ and $v_0(t) = v(t + t_0)$, such that $v_0(t) > 0$, $t > 0$, will differ only by a
constant.


To sum up, we consider non-decreasing, convex deutility function $v : \mathbb{R} \mapsto \mathbb{R}$ such that

$$v(t) = v([t]_+) = \begin{cases} v(t) > 0, & t > 0, \\ 0, & t \le 0. \end{cases}$$

We will refer to such a function as a one-sided deutility. Then, using the corresponding function $\phi$ in
representation (2) one obtains a class of certainty equivalent measures of risk:

$$\rho(X) = \min_\eta\ \eta + \frac{1}{1 - \alpha}\, v^{-1} E v(X - \eta) = \min_\eta\ \eta + \frac{1}{1 - \alpha}\, v^{-1} E v([X - \eta]_+). \qquad (6)$$

Next we analyze the conditions under which formulation (6) yields a coherent or convex measure of risk.
Recall that we assume the space $\mathcal{X}$ to be such that the certainty equivalent above is well-defined, particularly,
the integrability condition is satisfied.

Proposition 2.7. If $v$ is a one-sided deutility function, then $\phi(X) = \frac{1}{1-\alpha} v^{-1} E v(X)$ is proper, l.s.c.,
satisfies the axiom of monotonicity and $\phi(\eta) > \eta$ for all $\eta \neq 0$.

Proof. Clearly, such a $\phi$ is proper and l.s.c. The monotonicity property of $\phi$ defined by (5), $\phi(X) \le
\phi(Y)$ for all $X \le Y$, follows from both $v$ and $v^{-1}$ being non-decreasing. Finally, note that

$$\phi(\eta) = \frac{1}{1 - \alpha}\, v^{-1} v(\eta) = \frac{1}{1 - \alpha}\, v^{-1} v([\eta]_+)
= \frac{1}{1 - \alpha} \sup\{t : v(t) = v([\eta]_+)\} \ge \frac{1}{1 - \alpha} [\eta]_+ > \eta$$

for all $\eta \neq 0$. □


From Proposition 2.7 we can conclude that in order for the conditions of Theorem 2.5 to be satisfied we
only need to guarantee convexity of the certainty equivalent (5) (note that axiom (A4) is satisfied if the
certainty equivalent itself is positive homogeneous). A sufficient condition of this type has been established
in Ben-Tal and Teboulle (2007).

Proposition 2.8 (Ben-Tal and Teboulle, 2007). If $v \in C^3(\mathbb{R})$ is strictly convex and $v'/v''$ is convex, then
the certainty equivalent $v^{-1} E v$ is also convex.

The following observation adapts this result to establish convexity of certainty equivalents in the case of
one-sided deutility functions.

Corollary 2.9. If $v \in C^3[0, \infty)$ is strictly convex and $v'/v''$ is convex on $[0, +\infty)$, then certainty equivalent
$v_+^{-1} E v_+$ is convex, where $v_+(t) = v([t]_+)$.

Proof. Indeed, note that $v_+^{-1} E v_+(X) = v_+^{-1} E v([X]_+) = v^{-1} E v([X]_+)$, which is convex as a superposition
of a convex function (Proposition 2.8 for function $v$) and a non-decreasing convex function. □

Remark 2.7. Conditions of Proposition 2.8 are only sufficient, i.e., it is possible for a certainty equivalent
to be convex without satisfying these conditions (as is shown in Corollary 2.9). Moreover, these
conditions are rather restrictive. Thus, it is worth noting that if $v$ is a one-sided deutility function such
that the corresponding certainty equivalent is convex, then the certainty equivalent measure $\rho$ defined by
(6) is a convex (or coherent) measure of risk, regardless of whether Corollary 2.9 holds. At the same
time, this result can be useful, as is demonstrated by Proposition 3.1.

Observe that if function $\phi$ is taken in the form (5), where $v$ is a one-sided deutility, the structure of
the resulting risk measure (6) allows for an intuitive interpretation, similar to that proposed by Ben-Tal
and Teboulle (2007). Consider, for instance, a resource allocation problem where $X$ represents a
cost, unknown in advance, of resources necessary to cover future losses or damages. Assume that it
is possible to allocate amount $\eta$ worth of resources in advance, whereby the remaining part of costs,
$[X - \eta]_+$, will have to be covered after the actual realization of $X$ is observed. To a decision maker with
deutility $v$, the uncertain cost remainder $[X - \eta]_+$ is equivalent to the deterministic amount of certainty
equivalent $v^{-1} E v([X - \eta]_+)$. Since this portion of resource allocation is "unplanned", an additional
penalty is imposed. If this penalty is modeled using a multiplier $\frac{1}{1-\alpha}$, then the expected additional
cost of the resource is $\frac{1}{1-\alpha} v^{-1} E v([X - \eta]_+)$. Thus, the risk associated with the mission amounts to
$\eta + (1 - \alpha)^{-1} v^{-1} E v([X - \eta]_+)$, and can be minimized over all possible values of $\eta$, leading to definition
(6). Moreover, when applied to the general definition (2), this argument provides an intuition behind
the condition $\phi(\eta) > \eta$ above. Indeed, the positive difference $\phi(\eta) - \eta$ can be seen as a penalty for an
unplanned loss.

We also note that certainty equivalent representation (6) for coherent or convex measures of risk is related
to the optimized certainty equivalents (OCEs) due to Ben-Tal and Teboulle (2007),

$$\mathrm{OCE}(X) = \sup_\eta\ \eta + E u(X - \eta). \qquad (7)$$

While interpretations of formulas (6) and (7) are similar, and moreover, it can be shown that, under
certain conditions on the utility function, $\rho(X) = -\mathrm{OCE}(X)$ is a convex measure of risk, there are
important differences between these representations. In (7), the quantity being maximized is technically
not a certainty equivalent, while the authors have argued that specific conditions on the utility function
$u$ allowed them to consider it as one. In addition, representation (7) entails addition of values with
generally inconsistent units, e.g., dollars and utility. Finally, as shown above, representation (6) allows
for constructing both coherent and convex measures of risk, while the OCE approach yields a coherent
risk measure if and only if the utility function is piecewise linear.

Remark 2.8. It is straightforward to observe that by choosing the one-sided deutility function in (6) in the
form $v(t) = [t]_+$ one obtains the well-known Conditional Value-at-Risk (CVaR) measure (Rockafellar
and Uryasev, 2002), while one-sided deutility $v(t) = [t]_+^p$ yields the Higher-Moment Coherent Risk
(HMCR) measures (Krokhmal, 2007).
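As a numerical illustration of Remark 2.8, the following Python sketch evaluates representation (6) on an equiprobable sample for two one-sided deutilities. The helper name `ce_risk`, the ternary search used for the one-dimensional minimization, the sample, and the exponent $p = 2$ are all assumptions made for this example; the sketch also assumes that the certainty equivalent is convex, so that the objective is convex in $\eta$.

```python
import math

def ce_risk(xs, alpha, v, v_inv, iters=200):
    """Evaluate rho(X) = min_eta eta + v_inv(E v([X - eta]_+)) / (1 - alpha)
    for an equiprobable sample xs. Returns (risk value, a minimizing eta)."""
    def objective(eta):
        mean_v = sum(v(max(x - eta, 0.0)) for x in xs) / len(xs)
        return eta + v_inv(mean_v) / (1.0 - alpha)

    lo, hi = min(xs), max(xs)   # for eta >= max(xs) the objective equals eta
    width = (hi - lo) or 1.0
    # Widen the bracket to the left until the convex objective stops decreasing.
    while objective(lo - width) < objective(lo):
        lo -= width
        width *= 2.0
    lo -= width
    # Ternary search, valid because the objective is assumed convex in eta.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    eta = 0.5 * (lo + hi)
    return objective(eta), eta

sample = [float(k) for k in range(1, 11)]   # hypothetical losses 1, 2, ..., 10
alpha = 0.8

# v(t) = [t]_+ : the certainty equivalent is the mean, which gives CVaR.
cvar, eta_cvar = ce_risk(sample, alpha, lambda t: t, lambda a: a)
# v(t) = [t]_+^2 : the certainty equivalent is the 2-norm, which gives HMCR with p = 2.
hmcr, eta_hmcr = ce_risk(sample, alpha, lambda t: t * t, math.sqrt)
```

For this sample CVaR at $\alpha = 0.8$ is the average of the two largest losses, and the HMCR value is at least as large because the 2-norm dominates the mean.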

Remark 2.9. In general, risk measure $\rho$ is called a tail measure of risk if it quantifies the risk of $X$
through its right tail, $[X - c]_+$, where the tail cutoff point $c$ can be adjusted according to risk preferences
(Krokhmal et al., 2011). Observe that the above analysis implies that coherent or convex risk measures
based on certainty equivalents (6) are necessarily tail measures of risk (see also Propositions 2.14 and
2.15 below).

Another key property of the certainty equivalent measures of risk (6) is that they "naturally" preserve
stochastic orderings induced on the space $\mathcal{X}$ of random outcomes by the utility function $u$ or, equivalently,
deutility $v$. Assuming again that $\mathcal{X}$ is endowed with necessary integrability properties, consider
the properties of isotonicity with respect to second order stochastic dominance (SSD) (see, e.g., De
Giorgi, 2005; Pflug, 2006):

(A1') SSD isotonicity: $\rho(X) \le \rho(Y)$ for all $X, Y \in \mathcal{X}$ such that $-X \succeq_{\mathrm{SSD}} -Y$,

and, more generally, isotonicity with respect to k-th order stochastic dominance (kSD):

(A1'') kSD isotonicity: $\rho(X) \le \rho(Y)$ for all $X, Y \in \mathcal{X}$ such that $-X \succeq_{k\mathrm{SD}} -Y$,
for a given $k \ge 1$.

Recall that random outcome $X$ is said to dominate outcome $Y$ with respect to second-order stochastic
dominance, $X \succeq_{\mathrm{SSD}} Y$, if

$$\int_{-\infty}^{t} F_X(\xi)\, d\xi \le \int_{-\infty}^{t} F_Y(\xi)\, d\xi \quad \text{for all } t \in \mathbb{R},$$

where $F_Z(t) = P\{Z \le t\}$ is the c.d.f. of a random element $Z \in \mathcal{X}$. Similarly, outcome $X$ dominates
outcome $Y$ with respect to k-th order stochastic dominance, $X \succeq_{k\mathrm{SD}} Y$, if

$$F_X^{(k)}(t) \le F_Y^{(k)}(t) \quad \text{for all } t \in \mathbb{R},$$

where $F_X^{(k)}(t) = \int_{-\infty}^{t} F_X^{(k-1)}(\xi)\, d\xi$ and $F_X^{(1)}(t) = P\{X \le t\}$ (see, for example, Ogryczak and
Ruszczynski, 2001). Stochastic dominance relations in general, and SSD in particular, have occupied
a prominent place in decision making literature (see, for example, Levy (1998) for an extensive account),
in particular due to a direct connection to the expected utility theory. Namely, it is well known
(Rothschild and Stiglitz, 1970) that $X \succeq_{\mathrm{SSD}} Y$ if and only if $E u(X) \ge E u(Y)$ for all non-decreasing and
concave utility functions $u$, i.e., if and only if $Y$ is never preferred over $X$ by any rational risk-averse
decision maker. In general, it can be shown that $X \succeq_{k\mathrm{SD}} Y$ if and only if $E u(X) \ge E u(Y)$ for all
$u \in U^{(k)}$, where $U^{(k)}$ is a specific class of real-valued utility functions; particularly, $U^{(1)}$ consists of all
non-decreasing functions, $U^{(2)}$ contains all non-decreasing and concave functions, $U^{(3)}$ amounts to all
non-decreasing, concave functions with convex derivative, and so on (see, for example, Fishburn (1977)
and references therein). This characterization of the kSD dominance relation naturally implies that the
proposed certainty equivalent representation yields risk measures that are necessarily kSD-isotonic, given
that the set of considered deutility functions is appropriately restricted.
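The SSD definition above can be checked directly on small equiprobable samples. The sketch below is only an illustration under assumed data: it uses the standard identity $\int_{-\infty}^{t} F_Z(\xi)\, d\xi = E[t - Z]_+$, compares the integrated c.d.f.s at the sample points (where the piecewise linear functions have their kinks), and then confirms SSD isotonicity for CVaR on one pair of hypothetical losses. The function names are made up for this example.

```python
def integrated_cdf(sample, t):
    """Integral of the empirical c.d.f. up to t, computed as E[(t - Z)_+]."""
    return sum(max(t - z, 0.0) for z in sample) / len(sample)

def dominates_ssd(x, y, tol=1e-12):
    """True if X dominates Y in second order for equiprobable samples x and y.
    Both integrated c.d.f.s are piecewise linear with kinks only at sample
    points, and their difference is constant beyond the largest point,
    so it suffices to compare them at the sample points."""
    points = sorted(set(x) | set(y))
    return all(integrated_cdf(x, t) <= integrated_cdf(y, t) + tol for t in points)

def cvar(losses, alpha):
    """CVaR of an equiprobable loss sample; the piecewise linear objective
    eta + E[X - eta]_+ / (1 - alpha) attains its minimum at a sample point."""
    m = len(losses)
    return min(eta + sum(max(l - eta, 0.0) for l in losses) / ((1.0 - alpha) * m)
               for eta in losses)

loss_x = [2.0, 2.0, 2.0, 2.0]   # hypothetical constant loss
loss_y = [0.0, 4.0, 0.0, 4.0]   # same mean, more spread out
reward_x = [-l for l in loss_x]
reward_y = [-l for l in loss_y]

x_dominates = dominates_ssd(reward_x, reward_y)   # -X dominates -Y
y_dominates = dominates_ssd(reward_y, reward_x)   # the converse fails
```

Here $-X \succeq_{\mathrm{SSD}} -Y$ holds, and correspondingly the CVaR of `loss_x` does not exceed the CVaR of `loss_y`.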

Proposition 2.10. If deutility function $v$ is such that $-v(-t) \in U^{(k)}$, then risk measure $\rho$ given by the
certainty equivalent representation (6) is kSD-isotonic, i.e., satisfies (A1'').

Proof. Follows immediately from the definitions of kSD dominance, kSD isotonicity, and the above
discussion. □

Corollary 2.11. If a real-valued function $v$ is a one-sided deutility, then (6) defines a risk measure that
is isotonic with respect to second order stochastic dominance.

Note that Proposition 2.10 does not require the certainty equivalent in (6) to be convex. In this context, the
certainty equivalent representation (6) ensures that the risk-averse preferences expressed by a given utility
(equivalently, deutility) function are "transparently" inherited by the corresponding certainty equivalent
measure of risk.

2.4  Optimality  Conditions  and  Some  Properties  of  Optimal  $\eta$

Consider the definition of Conditional Value-at-Risk,

$$\mathrm{CVaR}_\alpha(X) = \min_\eta\ \eta + \frac{1}{1 - \alpha}\, E[X - \eta]_+.$$

The lowest value of $\eta$ that delivers minimum in this definition is known in the literature as Value-at-Risk
(VaR) at confidence level $\alpha$, and while VaR in general is not convex, it is widely used as a measure of
risk in practice, especially in financial applications (Jorion, 1997; Duffie and Pan, 1997). Thus, it is of
interest to investigate some properties of $\eta^*(X) \in \arg\min_\eta \{\eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)\}$. First, we formulate
the necessary and sufficient optimality conditions.

Proposition 2.12. Suppose that $v$ is a non-decreasing and convex function, certainty equivalent $v^{-1} E v$
is convex and $E \partial_\pm v(X - \eta^*)$ is well defined, then $\eta^* \in \arg\min_\eta \{\eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)\}$ if and only if

$$\partial_- v^{-1}(E v(X - \eta^*)) \cdot E \partial_- v(X - \eta^*) \le 1 - \alpha \le \partial_+ v^{-1}(E v(X - \eta^*)) \cdot E \partial_+ v(X - \eta^*),$$

where $\partial_\pm v$ denote one-sided derivatives of $v$ with respect to the argument.

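Proof. Let us denote $\phi_X(\eta) = \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$. Since certainty equivalent $v^{-1} E v$ is convex, $\phi_X$
is also convex, and thus, it has left and right derivatives everywhere on $\operatorname{dom} \phi_X \neq \emptyset$, and $\eta$ delivers a
minimum to $\phi_X$ if and only if $\partial_- \phi_X(\eta) \le 0 \le \partial_+ \phi_X(\eta)$. In what follows, we determine closed form
expressions for left and right derivatives of $\phi_X$. By definition, if $\eta \in \operatorname{dom} \phi_X$ then

$$\partial_+ \phi_X(\eta) = \lim_{\varepsilon \downarrow 0} \frac{\phi_X(\eta + \varepsilon) - \phi_X(\eta)}{\varepsilon}
= 1 + \frac{1}{1 - \alpha} \lim_{\varepsilon \downarrow 0} \frac{v^{-1} E v(X - \eta - \varepsilon) - v^{-1} E v(X - \eta)}{\varepsilon}.$$

Repeating a usual argument used to prove the chain rule of differentiation (see, e.g., Randolph, 1952),
we can define

$$Q(y) = \begin{cases} \dfrac{v^{-1}(y) - v^{-1} E v(X - \eta)}{y - E v(X - \eta)}, & y < E v(X - \eta), \\[2mm] \partial_- v^{-1}(E v(X - \eta)), & \text{otherwise}, \end{cases}$$

in which case

$$\partial_+ \phi_X(\eta) = 1 + \frac{1}{1 - \alpha} \lim_{\varepsilon \downarrow 0} \Big\{ Q(E v(X - \eta - \varepsilon)) \cdot \frac{E v(X - \eta - \varepsilon) - E v(X - \eta)}{\varepsilon} \Big\}.$$

Clearly, $\lim_{\varepsilon \downarrow 0} Q(E v(X - \eta - \varepsilon)) = \partial_- v^{-1}(E v(X - \eta))$ by monotone convergence theorem, and the
only part left to find is

$$\lim_{\varepsilon \downarrow 0} \frac{E v(X - \eta - \varepsilon) - E v(X - \eta)}{\varepsilon} = -\lim_{\varepsilon \downarrow 0} \frac{E v(X - \eta) - E v(X - \eta - \varepsilon)}{\varepsilon}.$$

Observe that $\lim_{\varepsilon \downarrow 0} \frac{v(x - \eta) - v(x - \eta - \varepsilon)}{\varepsilon} = \partial_- v(x - \eta)$ for any fixed $x \in \mathbb{R}$ (note that $\partial_- v(x - \eta)$
exists since $v$ is convex). Moreover,

$$\frac{v(x - \eta) - v(x - \eta - \varepsilon)}{\varepsilon} \nearrow \partial_- v(x - \eta) \quad \text{as } \varepsilon \downarrow 0,$$

where $\nearrow$ denotes monotonic convergence from below (Rockafellar, 1997, Theorem 23.1). Thus, by
monotone convergence theorem, we can interchange the limit and expectation:

$$\lim_{\varepsilon \downarrow 0} \frac{E v(X - \eta) - E v(X - \eta - \varepsilon)}{\varepsilon} = E \lim_{\varepsilon \downarrow 0} \frac{v(X - \eta) - v(X - \eta - \varepsilon)}{\varepsilon} = E \partial_- v(X - \eta),$$

i.e., $\partial_+ \phi_X(\eta) = 1 - \frac{1}{1-\alpha}\, \partial_- v^{-1}(E v(X - \eta)) \cdot E \partial_- v(X - \eta)$. Similar arguments can be invoked to evaluate
$\partial_- \phi_X(\eta)$ in order to complete the proof. □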

Corollary 2.13. Condition

$$(v^{-1})'(E v(X - \eta))\, E v'(X - \eta) = 1 - \alpha$$

is sufficient for $\eta$ to deliver the minimum in (5), given that $(v^{-1})'$ and $v'$ are well-defined.
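For the CVaR deutility $v(t) = [t]_+$ the condition of Proposition 2.12 takes a familiar form. When $E v(X - \eta) > 0$ the inverse $v^{-1}$ is the identity near that value, so both of its one-sided derivatives equal 1, while $E \partial_- v(X - \eta) = P(X > \eta)$ and $E \partial_+ v(X - \eta) = P(X \ge \eta)$. The condition then reads $P(X > \eta^*) \le 1 - \alpha \le P(X \ge \eta^*)$. The following Python sketch checks this on a hypothetical equiprobable sample, using exact rational arithmetic to avoid rounding at the boundary; the function names and data are assumptions for illustration only.

```python
from fractions import Fraction

def cvar_objective(xs, alpha, eta):
    """eta + E[X - eta]_+ / (1 - alpha) for an equiprobable integer sample xs."""
    tail = sum(max(x - eta, 0) for x in xs)
    return eta + Fraction(tail) / ((1 - alpha) * len(xs))

def satisfies_condition(xs, alpha, eta):
    """Check P(X > eta) <= 1 - alpha <= P(X >= eta) on the sample."""
    m = len(xs)
    p_gt = Fraction(sum(1 for x in xs if x > eta), m)    # E of left derivative of v
    p_ge = Fraction(sum(1 for x in xs if x >= eta), m)   # E of right derivative of v
    return p_gt <= 1 - alpha <= p_ge

xs = list(range(1, 11))        # hypothetical losses 1, 2, ..., 10
alpha = Fraction(4, 5)

# Sample points that satisfy the optimality condition ...
by_condition = [eta for eta in xs if satisfies_condition(xs, alpha, eta)]
# ... and sample points that actually minimize the CVaR objective.
best = min(cvar_objective(xs, alpha, eta) for eta in xs)
by_minimization = [eta for eta in xs if cvar_objective(xs, alpha, eta) == best]
```

Both lists coincide, and the smallest element is the Value-at-Risk at level $\alpha$ for this sample.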

Conditions established above show that for a fixed $X$, the location of $\eta^*(X)$ is determined by the parameter
$\alpha$. Two propositions below illustrate this observation.
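Proposition 2.14. Given an $X \in \operatorname{dom} \rho$ for all $\alpha \in (0, 1)$, if $\eta^*_\alpha(X) \in \arg\min_\eta\, \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$,
where $v$ is a one-sided deutility function, and certainty equivalent $v^{-1} E v$ exists for any $X$, and is convex,
then $\eta^*_{\alpha_1}(X) \le \eta^*_{\alpha_2}(X)$ for any $\alpha_1 < \alpha_2$.

Proof. Below we will use $\eta^*_\alpha(X)$ and $\eta^*_\alpha$ interchangeably in order to simplify the notation. Let $\alpha_1 < \alpha_2$.
Since $v$ is a one-sided deutility, then $v(X - \eta) = v([X - \eta]_+)$, and by the definition of $\eta^*_\alpha(X)$,

$$\eta^*_{\alpha_1} + \frac{1}{1 - \alpha_1} v^{-1} E v([X - \eta^*_{\alpha_1}]_+) \le \eta^*_{\alpha_2} + \frac{1}{1 - \alpha_1} v^{-1} E v([X - \eta^*_{\alpha_2}]_+).$$

Suppose that $\eta^*_{\alpha_1} > \eta^*_{\alpha_2}$, then one has

$$0 < \eta^*_{\alpha_1} - \eta^*_{\alpha_2} \le \frac{1}{1 - \alpha_1} \big( v^{-1} E v([X - \eta^*_{\alpha_2}]_+) - v^{-1} E v([X - \eta^*_{\alpha_1}]_+) \big)
< \frac{1}{1 - \alpha_2} \big( v^{-1} E v([X - \eta^*_{\alpha_2}]_+) - v^{-1} E v([X - \eta^*_{\alpha_1}]_+) \big).$$

This immediately leads to

$$\eta^*_{\alpha_1} + \frac{1}{1 - \alpha_2} v^{-1} E v([X - \eta^*_{\alpha_1}]_+) < \eta^*_{\alpha_2} + \frac{1}{1 - \alpha_2} v^{-1} E v([X - \eta^*_{\alpha_2}]_+),$$

which contradicts the definition of $\eta^*_{\alpha_2}$, thus furnishing the statement of the proposition. □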

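Proposition 2.15. Given an $X \in \operatorname{dom} \rho$ for all $\alpha \in (0, 1)$, if $\eta^*_\alpha(X) \in \arg\min_\eta\, \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$,
where $v$ is a one-sided deutility function, and certainty equivalent $v^{-1} E v$ exists for any $X$ and is convex,
then

$$\lim_{\alpha \to 1} \eta^*_\alpha(X) = \operatorname{ess.sup}(X).$$

Proof. Again, let us consider function $\phi_X(\eta) = \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$, and since $v$ is a one-sided
deutility, $\phi_X(\eta) = \eta + \frac{1}{1-\alpha} v^{-1} \int_{X \ge \eta} v(X - \eta)\, dP$. Suppose that $\operatorname{ess.sup}(X) = A < +\infty$, consequently
$P(X \ge A - \varepsilon) > 0$ for any $\varepsilon > 0$. Note that $\phi_X(A) = A$. Now,

$$\phi_X(A - \varepsilon) = A - \varepsilon + \frac{1}{1 - \alpha} v^{-1} \int_{X \ge A - \varepsilon} v(X - A + \varepsilon)\, dP
\ge A - \varepsilon + \frac{1}{1 - \alpha} v^{-1} \int_{X \ge A - \varepsilon/2} v(X - A + \varepsilon)\, dP$$
$$\ge A - \varepsilon + \frac{1}{1 - \alpha} v^{-1}\Big( v\Big(\frac{\varepsilon}{2}\Big) P\Big(X \ge A - \frac{\varepsilon}{2}\Big) \Big) = A - \varepsilon + \frac{1}{1 - \alpha} M_\varepsilon,$$

where $M_\varepsilon = v^{-1}\big( v(\frac{\varepsilon}{2}) P(X \ge A - \frac{\varepsilon}{2}) \big) > 0$. Hence, $\phi_X(A - \varepsilon) > \phi_X(A)$ for any sufficiently large
values of $\alpha$, which means that in this case any $\eta^*_\alpha(X) \in \arg\min_\eta\, \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$ has to satisfy
$\eta^*_\alpha(X) \in (A - \varepsilon, A]$, and thus $\lim_{\alpha \to 1} \eta^*_\alpha(X) = A = \operatorname{ess.sup}(X)$.

Now, let $\operatorname{ess.sup}(X) = +\infty$. Note that $\int_{X \ge \eta} v(X - \eta)\, dP$ is a non-increasing function of $\eta$. Let $A \in \mathbb{R}$
and $\phi_X(A) = A + \frac{1}{1-\alpha} v^{-1} \int_{X \ge A} v(X - A)\, dP$. Since $\operatorname{ess.sup}(X) = +\infty$, there exists $\tilde A > A$ such that
$0 < \int_{X \ge \tilde A} v(X - \tilde A)\, dP < \int_{X \ge A} v(X - A)\, dP$. Thus, $\phi_X(\tilde A) = \tilde A + \frac{1}{1-\alpha} v^{-1} \int_{X \ge \tilde A} v(X - \tilde A)\, dP < \phi_X(A)$
for any sufficiently large $\alpha$, which yields $\eta^*_\alpha(X) > A$. Since the value of $A$ has been selected arbitrarily,
$\lim_{\alpha \to 1} \eta^*_\alpha(X) = +\infty = \operatorname{ess.sup}(X)$. □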
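Propositions 2.14 and 2.15 can be observed numerically. The sketch below is an illustration only: it uses the exponential one-sided deutility $v(t) = e^{[t]_+} - 1$, for which $v^{-1} E v(X - \eta) = \ln E e^{[X - \eta]_+}$, on a hypothetical five-point equiprobable sample, and locates a minimizer by ternary search. The function name, the sample, and the chosen values of $\alpha$ are assumptions. For this deutility the objective decreases to the left of the smallest sample point and increases to the right of the largest one, so the search interval is the sample range.

```python
import math

def logexp_argmin(xs, alpha, iters=200):
    """A minimizer eta of eta + ln E exp([X - eta]_+) / (1 - alpha)
    for an equiprobable sample xs, found by ternary search on [min xs, max xs]."""
    def objective(eta):
        mean_exp = sum(math.exp(max(x - eta, 0.0)) for x in xs) / len(xs)
        return eta + math.log(mean_exp) / (1.0 - alpha)

    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

sample = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical losses
alphas = [0.5, 0.9, 0.99]
etas = [logexp_argmin(sample, a) for a in alphas]
```

The computed minimizers are non-decreasing in $\alpha$ and reach the largest sample value, which plays the role of $\operatorname{ess.sup}(X)$ for a finite sample.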
3  Application:  Log-Exponential  Convex  Measures  of  Risk


As was already mentioned above, CVaR and HMCR measures can be defined in terms of the proposed
certainty equivalent-based representation (6). Note that both cases correspond to positively homogeneous
functions $\phi$, and, therefore, are coherent measures of risk. Next we consider a convex measure of risk
resulting from the certainty equivalent representation (6) with an exponential one-sided deutility function
$v(t) = -1 + \lambda^{[t]_+}$:

$$\rho_\alpha^{(\lambda)}(X) = \min_\eta\ \eta + \frac{1}{1 - \alpha} \log_\lambda E \lambda^{[X - \eta]_+}, \quad \text{where } \lambda > 1 \text{ and } \alpha \in (0, 1). \qquad (8)$$

We refer to such $\rho_\alpha^{(\lambda)}$ as the family of log-exponential convex risk (LogExpCR) measures. First, using the
general framework developed above, it can be readily seen that the LogExpCR family are convex measures
of risk.
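As a short check of how (8) follows from (6), note that for $\lambda > 1$ the deutility $v(t) = \lambda^{[t]_+} - 1$ vanishes for $t \le 0$ and is strictly increasing for $t > 0$, so on $[0, \infty)$ its inverse in the sense defined earlier is $v^{-1}(a) = \log_\lambda(1 + a)$. Substituting into the certainty equivalent gives

$$v^{-1} E v(X - \eta) = \log_\lambda\big(1 + E[\lambda^{[X - \eta]_+} - 1]\big) = \log_\lambda E \lambda^{[X - \eta]_+},$$

and inserting this expression into (6) yields exactly the right-hand side of (8).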

Proposition 3.1. Functions $\rho_\alpha^{(\lambda)}(X)$ defined by (8) are proper convex measures of risk.

Proof. Follows immediately from Theorem 2.5, Proposition 2.7, and Corollary 2.9. □

A particular member of the family of LogExpCR measures is determined by the values of two parameters,
$\alpha$ and $\lambda$. Recall that in Section 2.4 we have established that parameter $\alpha$ plays a key role in determining
the position of $\eta^*_\alpha(X) \in \arg\min_\eta\, \eta + \frac{1}{1-\alpha} v^{-1} E v(X - \eta)$, particularly, $\alpha_1 < \alpha_2$ leads to $\eta^*_{\alpha_1}(X) \le \eta^*_{\alpha_2}(X)$,
and $\lim_{\alpha \to 1} \eta^*_\alpha(X) = \operatorname{ess.sup}(X)$. These two properties allow us to conclude that $\alpha$ determines the
"length" of the tail of distribution of $X$, or, in other words, determines which part of the distribution
should be considered "risky". This is in accordance with a similar property of the CVaR measure, which,
in the case of a continuous loss distribution, quantifies the risk as the expected loss in the worst $(1 - \alpha) \cdot 100$
percent of the cases. See Krokhmal (2007) for a similar argument for HMCR measures.

Furthermore, one has

$$\rho_\alpha^{(\lambda)}(X) = \min_\eta\ \eta + \frac{1}{1 - \alpha} \log_\lambda E \lambda^{[X - \eta]_+} = \min_\eta\ \eta + \frac{1}{1 - \alpha} \frac{1}{\ln \lambda} \ln E e^{\ln \lambda\, [X - \eta]_+}$$
$$= \frac{1}{\ln \lambda} \min_\eta\ \eta \ln \lambda + \frac{1}{1 - \alpha} \ln E e^{[X \ln \lambda - \eta \ln \lambda]_+}
= \frac{1}{\ln \lambda} \min_{\eta'}\ \eta' + \frac{1}{1 - \alpha} \ln E e^{[X \ln \lambda - \eta']_+} = \frac{1}{\ln \lambda}\, \rho_\alpha^{(e)}(X \ln \lambda).$$

This implies that LogExpCR measures satisfy a "quasi positive homogeneity" property:

$$\rho_\alpha^{(\lambda)}(X) \ln \lambda = \rho_\alpha^{(e)}(X \ln \lambda),$$

where parameter $\ln \lambda$ plays the role of a scaling factor. Thus, in the case of log-exponential convex
measures of risk (8), scaling can be seen as a way to designate the total range of the loss variable.
Consequently, a combination of the parameters $\alpha$ and $\lambda$ determines both the region of the loss distribution
that should be considered "risky", and the emphasis that should be put on the larger losses. Note that the
specific values of these parameters depend on the decision maker's preferences and attitude towards risk.
In practice, they may be determined and/or calibrated through preliminary computational experiments.
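The quasi positive homogeneity property can be verified numerically on a small example. The following Python sketch evaluates definition (8) for an equiprobable sample and compares $\rho_\alpha^{(\lambda)}(X) \ln \lambda$ with $\rho_\alpha^{(e)}(X \ln \lambda)$. The function name `logexp_risk`, the ternary search, the sample, and the parameter values are assumptions made for this illustration.

```python
import math

def logexp_risk(xs, alpha, lam=math.e, iters=200):
    """rho_alpha^(lam)(X) = min_eta eta + log_lam E lam^([X - eta]_+) / (1 - alpha)
    for an equiprobable sample xs, minimized by ternary search on [min xs, max xs]."""
    ln_lam = math.log(lam)

    def objective(eta):
        # lam ** t is computed as exp(t * ln lam)
        mean_pow = sum(math.exp(ln_lam * max(x - eta, 0.0)) for x in xs) / len(xs)
        return eta + math.log(mean_pow) / ln_lam / (1.0 - alpha)

    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))

sample = [0.3, 1.7, 2.2, 5.0, 9.1]   # hypothetical losses
alpha, lam = 0.5, 2.0

lhs = logexp_risk(sample, alpha, lam) * math.log(lam)
rhs = logexp_risk([x * math.log(lam) for x in sample], alpha, math.e)
```

Up to floating-point error the two quantities agree, which reflects the fact that changing the base $\lambda$ is equivalent to rescaling the loss variable.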

It is of interest to note that LogExpCR measures are isotonic with respect to any order $k \ge 1$ of stochastic
dominance:

Proposition 3.2. The family of log-exponential convex measures of risk (8) are kSD-isotonic for any
$k \ge 1$, i.e., $\rho_\alpha^{(\lambda)}(X) \le \rho_\alpha^{(\lambda)}(Y)$ for all $X, Y \in \mathcal{X}$ such that $-X \succeq_{k\mathrm{SD}} -Y$.

Proof. Follows immediately from Proposition 2.10 for $v$ defined above. □

Based on these observations and the preceding discussion, we can conclude that the introduced family
of LogExpCR measures possesses a number of desirable properties from both optimization and methodological
perspectives. It is widely acknowledged in the literature that risk is associated with "heavy"
tails of the loss distribution; for example, in Krokhmal (2007) it has been illustrated that evaluating risk
exposure in terms of higher tail moments can lead to improved decision making in financial applications
with heavy-tailed distributions of asset returns. Furthermore, there are many real-life applications where
risk exposure is associated with catastrophic events of very low probability and extreme magnitude, such
as natural disasters, which often turn out to be challenging for traditional analytic tools (see, for example,
Kousky and Cooke, 2009; Cooke and Nieboer, 2011 and references therein, Iaquinta et al., 2009, or
Kreinovich et al., 2012). By construction, LogExpCR measures quantify risk by putting extra emphasis
on the tail of the distribution, which allows us to hypothesize that they could perform favorably compared
to conventional approaches in situations that involve heavy-tailed distributions of losses and catastrophic
risks. This conjecture has been tested in two numerical case studies that are presented next. The idea
is to compare the quality of solutions based on the risk estimates due to the nonlinear LogExpCR measure
with those obtained using the linear CVaR measure, which can now be considered a standard approach in
risk-averse applications. Particularly, we were interested in assessing the influence that the behavior of
the tails of the underlying loss distributions has in this comparison.

3.1  Case  Study  1:  Flood  Insurance  Claims  Model 

Dataset description. For the first part of the case study we used a dataset managed by a non-profit
research organization Resources for the Future (Cooke and Nieboer, 2011). It contains flood insurance
claims, filed through the National Flood Insurance Program (NFIP), aggregated by county and year for the
State of Florida from 1980 to 2006. The data is in 2000 US dollars divided by personal income estimates
per county per year from the Bureau of Economic Analysis (BEA), in order to take into account substantial
growth in exposure to flood risk. The dataset has 67 counties and spans 355 months.


Model formulation. Let random vector $\ell$ represent the dollar values of insurance claims (individual
elements of this vector correspond to individual counties), and consider the following stochastic programming
problem, where $\rho$ is a risk measure:

$$\min\ \rho(\ell^\top x) \qquad (9a)$$
$$\text{s. t.}\ \sum_i x_i = K, \qquad (9b)$$
$$x_i \in \{0, 1\}. \qquad (9c)$$

Such a formulation allows for a straightforward interpretation, namely, the goal here is to identify $K$
counties with a minimal common insurance risk due to flood as estimated by $\rho$. Clearly, such a simplified
model does not reflect the complexities of real-life insurance operations. At the same time, since the
purpose of this case study is to analyze the properties of risk measures themselves, a deliberately simple
formulation was chosen so as to highlight the differences between solutions of (9) due to different choices
of the risk measure $\rho$ in (9a).

Given that the distribution of $\ell$ is represented by equiprobable scenario realizations $\ell_{ij}$, and $m$ is the
number of scenarios (time periods), model (9) with risk measure chosen as the Conditional Value-at-Risk,
$\rho(X) = \mathrm{CVaR}_\alpha(X)$, can be expressed as

$$\min\ \eta + \frac{1}{1 - \alpha_{\mathrm{CVaR}}} \sum_j \frac{1}{m} \Big[ \sum_i x_i \ell_{ij} - \eta \Big]_+ \qquad (10a)$$
$$\text{s. t.}\ \sum_i x_i = K, \qquad (10b)$$
$$x_i \in \{0, 1\}. \qquad (10c)$$

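Similarly, if a LogExpCR measure is used, $\rho(X) = \mathrm{LogExpCR}_{\alpha}(X)$, then (9) can be formulated as

$$
\begin{aligned}
\min\quad & \eta + \frac{1}{1 - \alpha_{\mathrm{LogExpCR}}} \log \sum_j \frac{1}{m}\, e^{\left[ \sum_i x_i \ell_{ij} - \eta \right]_+} && \text{(11a)} \\
\text{s.t.}\quad & \sum_i x_i = K && \text{(11b)} \\
& x_i \in \{0, 1\}. && \text{(11c)}
\end{aligned}
$$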
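The objective in (11a) is a one-dimensional convex function of $\eta$ once the selection is fixed, so it can be evaluated with a simple line search. The sketch below assumes the natural logarithm and base $e$ (the study's choice of $\lambda = e$) and uses a golden-section search; the function name `logexpcr` and the fixed iteration count are choices made for this example, not the authors' branch-and-bound implementation.

```python
import numpy as np


def logexpcr(losses, alpha, iterations=200):
    """Minimize eta + log(mean(exp([loss - eta]_+))) / (1 - alpha) over eta.

    Returns the pair (risk value, minimizing eta).
    """
    losses = np.asarray(losses, dtype=float)

    def objective(eta):
        excess = np.maximum(losses - eta, 0.0)
        shift = excess.max()  # stabilize the exponentials against overflow
        log_mean_exp = shift + np.log(np.mean(np.exp(excess - shift)))
        return eta + log_mean_exp / (1.0 - alpha)

    # Below min(losses) the objective decreases linearly and above max(losses)
    # it equals eta, so a minimizer lies in [min(losses), max(losses)].
    lo, hi = losses.min(), losses.max()
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if objective(a) <= objective(b):
            hi = b
        else:
            lo = a
    eta = 0.5 * (lo + hi)
    return objective(eta), eta


value, eta_star = logexpcr([0.0, 1.0, 2.0, 10.0], alpha=0.5)
print(value, eta_star)
```

The resulting value always lies between the mean and the maximum of the scenario losses, which gives a quick sanity check on any implementation.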


Normal data  In order to evaluate the effect of the tail behavior of the loss distribution on the obtained
solutions of decision making problems, we additionally generated a similar dataset based on the normal
distribution. Particularly, we drew 355 realizations from a 67-dimensional normal distribution with mean
$\mu$ and covariance matrix $\Sigma$, where $\mu$ and $\Sigma$ are the mean and covariance estimates of the NFIP data, respectively.
Our goal here is to make sure that the main difference between the datasets lies in the tails (the normal
distribution is a well-known example of a light-tailed distribution), and by preserving the mean vector and
covariance matrix we ensure that this dataset captures the leading trends present in the original data. Now,
by comparing the decisions due to CVaR and LogExpCR measures for these two datasets we can draw
conclusions on the effects that the tails of the distributions have on the quality of subsequent decisions.
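The construction of the light-tailed companion dataset can be sketched as follows. The function name `simulate_normal_dataset`, the array layout (scenarios in rows, counties in columns), and the fixed seed are assumptions for this illustration; the source does not state how the sampling was implemented.

```python
import numpy as np


def simulate_normal_dataset(data, n_samples, seed=0):
    """Draw n_samples rows from N(mu, Sigma) fitted to `data`.

    `data` has shape (n_scenarios, n_counties); mu and Sigma are its sample
    mean vector and sample covariance matrix.
    """
    rng = np.random.default_rng(seed)
    mu = data.mean(axis=0)
    sigma = np.cov(data, rowvar=False)  # columns are the variables (counties)
    return rng.multivariate_normal(mu, sigma, size=n_samples)


# Hypothetical heavy-tailed stand-in for the NFIP data (355 x 67).
stand_in = np.random.default_rng(1).lognormal(sigma=2.0, size=(355, 67))
normal_data = simulate_normal_dataset(stand_in, n_samples=355)
print(normal_data.shape)
```

Note that normal samples can be negative even when the fitted data are not, which is one visible consequence of replacing the original tails by Gaussian ones.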


Implementation details  Problems (10) and (11) are a mixed-integer linear programming (MIP) problem
and a mixed-integer nonlinear programming (MINLP) problem, respectively. MIP problems were
solved using the IBM ILOG CPLEX 12.5 solver accessed through its C++ API. For the MINLPs of the form
(11) we implemented a custom branch-and-bound algorithm based on an outer polyhedral approximation
approach, which utilized the CPLEX 12.5 MIP solver and MOSEK 6.0 for NLP subproblems (Vinel and
Krokhmal, 2014).

In order to evaluate the quality of the decisions we employed a usual training-testing framework. Given a
preselected value $m$, the first $m$ scenarios were used to solve problems (10) and (11); then for the
remaining $N - m$ scenarios the total loss was calculated as $L^{\rho} = \sum_{j=m+1}^{N} \sum_i \ell_{ij} x_i^{\rho}$, where $x^{\rho}$ represents
an optimal solution of either problem (10) or problem (11), and $N$ is the total number of scenarios in
the dataset. In other words, the decision vector $x^{\rho}$ is selected based on the first $m$ observations of the
historical data (training), and the quality of this solution is estimated based on the “future” realizations
(testing).

In this case study we have set $\alpha_{\mathrm{CVaR}} = 0.9$, which is a typical choice in portfolio optimization and can
be interpreted as cutting off 90% of the least significant losses. A preliminary test experiment has been
performed to select $\alpha_{\mathrm{LogExpCR}}$ in such a way that approximately the same portion of the distribution was cut
off, which yielded $\alpha_{\mathrm{LogExpCR}} = 0.5$. For the sake of simplicity, we set $\lambda = e$.


Discussion of results  Tables 1 and 2 summarize the obtained results for NFIP and simulated normal
data sets, respectively. Discrepancy in the quality of the decisions based on LogExpCR and CVaR
measures is estimated using the value

$$\gamma = \frac{L^{\mathrm{LogExpCR}} - L^{\mathrm{CVaR}}}{\min\{L^{\mathrm{LogExpCR}},\, L^{\mathrm{CVaR}}\}},$$

which represents the relative difference in total losses $L^{\mathrm{LogExpCR}}$ and $L^{\mathrm{CVaR}}$ associated with the respective
decisions. For example, $\gamma = -100\%$ corresponds to the case when losses due to CVaR-based decision
were twice as large as losses due to LogExpCR-based decision.
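The testing-phase bookkeeping described above is small enough to write out. The sketch below computes the out-of-sample total loss of a fixed selection and the relative difference in percent; the function names and the toy loss matrix are hypothetical, and the second function assumes both total losses are strictly positive.

```python
import numpy as np


def out_of_sample_loss(ell, x, m):
    """Total loss of selection x over the testing scenarios m+1, ..., N.

    ell has shape (n_counties, N); x is a 0/1 vector of length n_counties.
    """
    ell = np.asarray(ell, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(x @ ell[:, m:].sum(axis=1))


def relative_difference(loss_logexpcr, loss_cvar):
    """Relative difference in percent; assumes strictly positive losses."""
    return 100.0 * (loss_logexpcr - loss_cvar) / min(loss_logexpcr, loss_cvar)


# CVaR-based losses twice as large as LogExpCR-based losses.
print(relative_difference(50.0, 100.0))  # -100.0

# Toy loss matrix (hypothetical): 2 counties, N = 4 scenarios, m = 2.
toy = [[1, 2, 3, 4], [10, 20, 30, 40]]
print(out_of_sample_loss(toy, [1, 0], m=2))  # 7.0
```

Because the denominator is the smaller of the two losses, the value is unbounded in both directions, which explains the very large magnitudes that can appear when one of the decisions incurs almost no loss.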

First of all, we can observe that there is a definite variation between the results obtained with NFIP
data on one hand and with simulated normal data on the other. Particularly, the absolute values of $\gamma$ in
Table 2 on average are considerably smaller compared to those in Table 1, which indicates that in the
case of normal data the risk measures under consideration result in similar decisions, while heavy-tailed
historical data leads to much more differentiated decisions.

Secondly, Table 1 suggests that the LogExpCR measure yields considerably better solutions for certain sets
of parameter values. Most notably, such instances correspond to smaller values of both K and m.
Intuitively, this can be explained as follows. Recall that m is the number of scenarios in the training set, and
N - m is the number of scenarios in the testing set, which means that larger values of m correspond to a
shorter testing horizon. Clearly, the fewer scenarios there are in the testing set, the fewer catastrophic
losses occur during this period, and vice versa, for smaller values of m there are more exceptionally high
losses in the future. Thus, the observed behavior of $\gamma$ is in accordance with our conjecture that
LogExpCR measures are better suited for instances with heavy-tailed loss distributions. Parameter K, in turn,
corresponds to the number of counties to be selected; thus, the larger its value is, the more opportunities
for diversification are available for the decision-maker, which, in turn, allows for risk reduction.

To sum up, the results of this case study suggest that under certain conditions, such as heavy-tailed
loss distribution, relatively poor diversification opportunities, and sufficiently large testing horizon,
risk-averse decision strategies based on the introduced log-exponential convex measures of risk can
substantially outperform strategies based on linear risk measures, such as the Conditional Value-at-Risk.

3.2  Case  Study  2:  Portfolio  Optimization 

As heavy-tailed loss distributions are often found in financial data, we conducted numerical experiments
with historical stock market data as the second part of the case study.


Model description  As the underlying decision making model we use the traditional risk-reward
portfolio optimization framework introduced by Markowitz (1952). In this setting, the cost/loss outcome $X$
is usually defined as the portfolio’s negative rate of return, $X(x, \omega) = -r(\omega)^{\top} x$, where $x$ stands for the
vector of portfolio weights, and $r = r(\omega)$ is the uncertain vector of assets’ returns. Then, a portfolio
allocation problem can be formulated as the problem of minimizing some measure of risk associated
with the portfolio while maintaining a prescribed expected return:

$$\min_{x \in \mathbb{R}^n_+} \Big\{ \rho(-r^{\top} x) \;\Big|\; \mathsf{E}(r^{\top} x) \ge \bar{r},\ \mathbf{1}^{\top} x \le 1 \Big\}, \tag{12}$$

where $\bar{r}$ is the prescribed level of expected return, $x \in \mathbb{R}^n_+$ denotes the no-short-selling requirement, and
$\mathbf{1} = (1, \ldots, 1)^{\top}$. If the risk measure used is convex, it is easy to see that (12) is a convex optimization
problem. In this case study, we again select $\rho$ in (12) as either a LogExpCR or CVaR measure.


Dataset description  We utilized historical stock market data available through Yahoo! Finance. We
picked 2178 listings traded at NYSE from March, 2000 through December, 2012 (total of 3223 trading
days). As it was noted above, financial data often exhibit highly volatile behavior, especially
higher-frequency data, while long-term data is usually relatively normal. In order to account for such differences,
we generated three types of datasets of loss distribution, which were based on two-day, two-week and
one-month historical returns. Particularly, if $p_{i,j}$ is the historical close price of asset $i$ on day $j$, then we
define the corresponding two-day, ten-day, and one-month returns as $r_{i,j} = (p_{i,j} - p_{i,j-\Delta}) / p_{i,j-\Delta}$, where $\Delta$ takes
values $\Delta = 2$, 10, and 20, respectively.
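A minimal sketch of the return construction follows. It assumes simple (arithmetic) returns over a lag of $\Delta$ trading days and a price array with assets in rows and days in columns; the function name `delta_day_returns` and the toy prices are hypothetical.

```python
import numpy as np


def delta_day_returns(prices, delta):
    """Simple returns over `delta` trading days.

    prices has shape (n_assets, n_days); the result has shape
    (n_assets, n_days - delta), and its column k holds the return
    (p[:, k + delta] - p[:, k]) / p[:, k].
    """
    prices = np.asarray(prices, dtype=float)
    return (prices[:, delta:] - prices[:, :-delta]) / prices[:, :-delta]


# One asset growing 10% per day (hypothetical prices).
toy_prices = [[100.0, 110.0, 121.0, 133.1]]
print(delta_day_returns(toy_prices, delta=2))  # two-day returns of about 0.21
```

Calling the function with `delta` equal to 2, 10, and 20 yields the three dataset types described above.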


Implementation details  We utilize a training-testing framework similar to the one used in the previous
section, but additionally employ a “rolling horizon” approach, which aims to simulate a real-life
self-financing trading strategy. For a given time moment, we generate a scenario set containing,
respectively, m two-day, ten-day, and one-month returns immediately preceding this date. Then, the portfolio
optimization problem (12) is solved for each type of scenario set in order to obtain the corresponding
optimal portfolios; the “realized” portfolio return over the next two-day, ten-day, or one-month time
period, respectively, is then observed. The portfolio is then rebalanced using the described procedure. This
rolling-horizon procedure was run for 800 days, or about 3 years.

Recall that parameter $\bar{r}$ in (12) represents the “target return”, i.e., the minimal average return of the
portfolio. For our purposes parameter $\bar{r}$ was selected as $\bar{r} = \tau \max_i \{\mathsf{E}_{\omega} r_i(\omega)\}$, i.e., as a certain
percentage of the maximum expected return previously observed in the market (within the timespan of the
current scenario set). Parameter $\tau$ has been set to be “low”, “moderate”, or “high”, which corresponds
to $\tau = 0.1, 0.5, 0.8$. For each pair of n and m we repeat the experiment 20 times, selecting n stocks
randomly each time. The parameters $\alpha_{\mathrm{LogExpCR}}$, $\alpha_{\mathrm{CVaR}}$, and $\lambda$ have been assigned the same values as in
Case Study 1.
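The rolling-horizon loop and the target-return rule can be sketched as below. This is a simplified illustration, not the study's code: it assumes non-overlapping period returns stored column by column, treats any uninvested fraction of the portfolio as earning zero, and takes the portfolio optimizer as a user-supplied callback `solve` (a hypothetical stand-in for a solver of problem (12)).

```python
import numpy as np


def target_return(scenarios, tau):
    """tau times the largest mean asset return in the current scenario set.

    scenarios has shape (n_assets, m).
    """
    return tau * scenarios.mean(axis=1).max()


def rolling_horizon(returns, m, tau, solve):
    """Track portfolio value as a multiple of the initial investment.

    returns has shape (n_assets, T); solve(scenarios, r_bar) must return a
    vector of nonnegative weights summing to at most one.
    """
    value = 1.0
    history = [value]
    for t in range(m, returns.shape[1]):
        scenarios = returns[:, t - m:t]            # the m periods preceding t
        weights = solve(scenarios, target_return(scenarios, tau))
        value *= 1.0 + float(returns[:, t] @ weights)  # realized next-period return
        history.append(value)
    return np.array(history)


def best_mean_asset(scenarios, r_bar):
    """Placeholder 'optimizer': invest everything in the best-mean asset."""
    weights = np.zeros(scenarios.shape[0])
    weights[scenarios.mean(axis=1).argmax()] = 1.0
    return weights


toy_returns = np.full((2, 5), 0.1)
print(rolling_horizon(toy_returns, m=2, tau=0.5, solve=best_mean_asset))
```

Replacing `best_mean_asset` with a routine that minimizes CVaR or LogExpCR subject to the constraints of (12) would give the two strategies compared in this case study.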


Discussion of results  Obtained results are summarized in Table 3, and a typical behavior of the
portfolio value over time is presented in Figure 1. As in the previous case, we report relative difference in
the return over appropriate time period (2-day, 2-week, or 1-month) averaged over the testing horizon of
800 days and over 20 random choices of n assets. Note that since in this case the quality of the decision
is estimated in terms of rate of return, i.e., gain, positive values in Table 3 correspond to the cases when
the LogExpCR-based portfolio outperforms the CVaR-based portfolio.

Similarly to the previous case, we can observe that the behavior of the tails of the distribution plays a key
role in the comparison: under 1-month trading frequency the differences between CVaR and LogExpCR
portfolios are relatively insignificant, compared to the 2-day case. Moreover, we can again conclude that
for heavy-tailed loss distributions the introduced LogExpCR measure may compare favorably against
CVaR; in particular, conditions of restricted diversification options (relatively small value of n) make
utilization of LogExpCR measures more beneficial compared to a linear measure such as CVaR.


4  Concluding  Remarks 

In this paper we introduced a general representation of the classes of convex and coherent risk measures
by showing that any convex (coherent) measure can be defined as an infimal convolution of the form
$\rho(X) = \min_{\eta}\, \eta + \phi(X - \eta)$, where $\phi$ is monotone, convex, and $\phi(\eta) > \eta$ for all $\eta \neq 0$, $\phi(0) = 0$ (and
positive homogeneous for coherence), and, vice versa, a function $\rho$ constructed in such a way is convex
(coherent). Another way to look at this result is to observe that a monotone and convex $\phi$ only lacks
translation invariance in order to satisfy the definition of a convex risk measure, and the infimal convolution
operator essentially forges this additional property, while preserving monotonicity and convexity.
According to this scheme, a risk measure is represented as a solution of an optimization problem, hence it
can be readily embedded in a stochastic programming model.

Secondly, we apply the developed representation to construct risk measures as infimal convolutions of
certainty equivalents, which allows for a direct incorporation of risk preferences as given by the utility
theory of von Neumann and Morgenstern (1944) into a convex or coherent measure of risk. This is highly
desirable since, in general, the risk preferences induced by convex or coherent measures of risk are
inconsistent with risk preferences of rational expected-utility maximizers. It is also shown that the certainty
equivalent-based measures of risk are “naturally” consistent with stochastic dominance orderings.

Finally, we employ the proposed scheme to introduce a new family of risk measures, which we call
the family of log-exponential convex risk measures. By construction, LogExpCR measures quantify
risk by placing emphasis on extreme or catastrophic losses; also, the LogExpCR measures have been
shown to be isotonic (consistent) with respect to stochastic dominance of arbitrary order. The results
of the conducted case study show that in highly risky environments characterized by heavy-tailed loss
distribution and limited diversification opportunities, utilization of the proposed LogExpCR measures
can lead to improved results compared to the standard approaches, such as those based on the
well-known Conditional Value-at-Risk measure.

Table 1: Relative difference (in %) in total loss, $\gamma = \frac{L^{\mathrm{LogExpCR}} - L^{\mathrm{CVaR}}}{\min\{L^{\mathrm{LogExpCR}},\, L^{\mathrm{CVaR}}\}}$, for NFIP data for various values of the
parameters K and m. Negative entries correspond to the instances for which the LogExpCR measure outperformed
CVaR.

K\m        20        60       100       140       180       220       260       300
  1  -45038.3   -3944.4   -4652.2   -3663.7   -3663.7    -220.2    -220.2    -220.2
  3   -1983.7    -971.0    -211.5    -146.7    -146.7     -68.4       0.0       0.0
  5   -1284.2    -464.2     -85.7     -13.1     -13.1       0.0      -6.6       0.0
  7    -853.9    -342.5       0.0       0.0       0.0      -0.4      -4.5     -13.1
  9    -387.1    -282.9       0.0       0.0       0.0       0.0      -3.1      10.8
 11    -369.9    -181.0       0.0     -18.2      14.0     -27.8       5.5      -2.2
 13    -360.4     -33.5       0.0     -13.0       1.0       4.3       0.0      41.0
 15    -353.9     -27.9      -3.2       3.1       0.0      -3.6       4.8      20.6
 17    -129.8      -1.1      -0.2       3.7     -26.5      11.5      25.4      25.4
 19     -66.3      21.6       0.9       0.0       0.0   -2042.1      35.4      23.1
 21     -64.0       5.0       0.0    -279.0       2.5      -1.6      35.0       8.0
 23     -57.4       4.8       0.0       0.7     -65.6      -0.1      20.3      81.8
 25     -49.5       0.0     -82.4       0.0     -39.2       4.4      76.9      84.7
 27     -48.2       0.0     -52.2       0.0       4.7       4.1      68.7      84.1
 29     -47.0     -34.3      33.0    -254.3    -463.8       4.0      81.8      83.5
 31     -41.1     -31.2       8.7    -218.4    -309.8       8.5      79.3      83.7
 33     -10.6      46.4     -10.0    -162.7    -161.7       8.9      19.6      84.6
 35      -9.5       0.0     -12.2    -142.9    -153.1      37.8      53.9      47.6
 37      -7.7      12.0     -81.7       5.3       2.7      57.0      15.0       9.9
 39       0.0       5.3    -102.8      45.8      45.4      43.4       8.6       5.6
 41       0.0      11.4     -77.3      30.9      43.8      34.8      20.7       4.8
 43       0.0     -13.4     -11.0      53.8       4.0      50.0      19.8      -3.4
 45       0.0       0.0     -28.1      54.5     -36.1      26.2      14.2       8.5
 47      -9.1       9.0       4.5      19.4       6.4      17.5      28.2      -2.2
 49       0.0       6.4      27.8      -3.5     -20.1       7.0       0.5      -8.5
 51       0.0       5.1      49.6      -2.6       1.0      -9.5      -5.0       2.1
 53      39.9     -16.2      24.7      28.6      23.0       2.9      -4.9    -137.2
 55      -7.1      -8.7      28.0      21.6      -0.9       5.6     -83.5     -19.1
 57       0.0       3.0      28.5       3.6       4.1       2.6      -9.1      -9.2
 59       0.0      20.0       9.0       2.0      -0.6       0.1      25.3      27.4

Table 2: Relative difference (in %) in total loss, $\gamma = \frac{L^{\mathrm{LogExpCR}} - L^{\mathrm{CVaR}}}{\min\{L^{\mathrm{LogExpCR}},\, L^{\mathrm{CVaR}}\}}$, for normal data for various values of the
parameters K and m. Negative entries correspond to the instances for which the LogExpCR measure outperformed
CVaR.

K\m        20        60       100       140       180       220       260       300
  1       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
  3       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
  5       0.0       2.2    -107.1       0.0       0.0     -17.6      22.8     -58.1
  7       0.0      14.0       0.0      85.3      28.8      84.8      86.4      21.9
  9       0.0      14.2      27.2      11.0       2.9       0.0       0.0       0.0
 11      17.2      12.9       0.0      34.8      33.9      66.8      36.0     -50.4
 13      19.2      16.6      -3.3     -11.1       2.6      18.0     -12.3       0.0
 15       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
 17       0.0      -3.4      34.0       0.0       0.0     312.1       0.0    -355.0
 19      43.0      -8.2       8.9      52.3      76.7     -65.9     -20.3       0.0
 21       0.0       4.3      21.5     -45.4     506.1    -123.1    -119.2       1.9
 23      27.3     -32.1      48.8      75.2     242.3     -63.1       3.0    -112.1
 25    -317.3      -8.0      16.9      74.7     151.1     -71.3    -129.0     -64.7
 27       9.7     -34.1      31.1     -50.8      96.3     154.3     163.8     -16.8
 29       7.4      13.7      19.4      78.4      44.6     272.6     -15.3     -31.4
 31       1.8      10.3       5.3       6.4      52.6     234.0      44.8      -5.5
 33      10.5     -14.7     -15.2     -31.2     -32.8      11.5     -15.0      10.1
 35       9.1       6.0       0.0       0.0      36.8      36.9     437.2       0.0
 37       5.1      -1.0       0.0      18.0      39.1      20.8     119.3       0.0
 39       0.0      -0.8      -1.4      10.3      13.2     -14.6    -109.7      73.6
 41      19.8      18.3       0.0      24.7      22.1       0.0      44.0     762.8
 43       7.6       8.7       6.4       0.0      -6.1       0.0       0.0       0.0
 45       6.9       5.9      11.4       7.9       6.1      16.9     -20.6     -99.3
 47       0.0       1.1      16.6       4.0      13.0       0.0      21.5      46.7
 49      -2.8      22.5      17.7      -7.5     -11.2      -2.3       0.0    -294.8
 51       0.0       5.1      17.8       5.0      10.4     -28.4      -0.1     -47.4
 53      -1.1       0.0      -6.7      -0.5      25.4       0.0       0.0     -39.7
 55       6.8       0.0      17.5      18.3       0.0      -9.3      37.8     -87.4
 57       1.3       0.0       0.0     -14.5     -21.8       0.0       0.0       0.0
 59       6.3       0.0       0.0       0.0       0.0       0.0       0.0     -10.0

Table 3: Relative difference (in %) in average portfolio return due to LogExpCR measure and CVaR. Parameter
n represents the total number of assets on the market, m is the number of time intervals in the training horizon,
$\tau$ defines the prescribed expected rate of return as the percentage of the maximum expected return previously
observed in the market. Labels “2-day”, “2-week”, and “1-month” correspond to portfolio rebalancing periods.

  n      m     τ     2-day    2-week   1-month
 20   2000   0.1      57.3      29.5       8.3
             0.5     138.3       1.1     -12.9
             0.8       5.9     -24.1      -7.4
200   2000   0.1     -17.9     -14.6      -2.2
             0.5      11.1     -21.1       5.4
             0.8      17.6     -13.5      -2.2

Figure 1: Typical behavior of portfolio value, as a multiple of the initial investment (1.0), over time.

Abstract

We present an efficient scenario decomposition algorithm for solving large-scale convex stochastic
programming problems that involve a particular class of downside risk measures. The considered
risk functionals encompass coherent and convex measures of risk that can be represented as an infimal
convolution of a convex certainty equivalent, and include well-known measures, such as conditional
value-at-risk, as special cases. The resulting structure of the feasible set is then exploited via iterative
solving of relaxed problems, and it is shown that the number of iterations is bounded by a parameter
that depends on the problem size. The computational performance of the developed scenario
decomposition method is illustrated on portfolio optimization problems involving two families of nonlinear
measures of risk, the higher-moment coherent risk measures and log-exponential convex risk
measures. It is demonstrated that for large-scale nonlinear problems the proposed approach can provide
up to an order of magnitude of improvement in computational time in comparison to state-of-the-art
solvers, such as CPLEX, Gurobi, and MOSEK.

Keywords: Stochastic optimization, risk measures, utility theory, certainty equivalent, scenario
decomposition, higher moment coherent risk measures, log-exponential convex risk measures.


1  Introduction  and  Motivation 

Quantification of uncertainties and risk via axiomatically defined statistical functionals, such as the
coherent measures of risk of Artzner et al. (1999), has become a widely accepted practice in stochastic
optimization and decision making under uncertainty (Shapiro et al., 2009; Krokhmal et al., 2011;
Uryasev and Rockafellar, 2013). Many such risk measures admit effective utilization in “scenario-based”
formulations of stochastic programming models, i.e., the stochastic optimization problems where the
random parameters are assumed to have a known distribution over a finite support that is commonly
called the scenario set. A typical instance of such a problem can be written as

$$\min\ \rho(X(x, \omega)), \tag{1}$$

where $\rho$ is the risk measure, $X(x, \omega)$ represents a stochastic loss or cost function dependent on the
decision vector $x \in C \subset \mathbb{R}^n$ and a random event $\omega$ from the finite set $\Omega = \{\omega_1, \ldots, \omega_N\}$. In many
practical applications accurate approximations of uncertainties may, however, require very large scenario
sets ($N \gg 1$), thus potentially leading to substantial computational difficulties.

In this work, we propose an efficient algorithm for solving large-scale stochastic optimization problems
involving a class of “downside”, or “tail” risk measures that are constructed via certainty equivalents,
a well-known concept in utility theory. The presented scenario decomposition algorithm exploits
the special structure of the feasible set induced by the respective risk measures as well as the properties
common to the considered class of risk functionals. As an illustrative example of the general approach,
we consider stochastic optimization problems with higher-moment coherent risk measures (HMCR),
which quantify risk via higher moments of cost or loss distributions (Krokhmal, 2007), making them
advantageous in the presence of “heavy-tailed” uncertainty. We also apply the proposed method to
problems with log-exponential convex risk (LogExpCR) measures (Vinel and Krokhmal, 2015).

Perhaps the most frequently implemented risk measure in problems of type (1) is the well-known
Conditional Value-at-Risk (CVaR) (Rockafellar and Uryasev, 2000, 2002). When $X$ is piecewise linear in
$x$ and set $C$ is polyhedral, formulation (1) with CVaR objective or constraints reduces to a linear
programming (LP) problem. Several recent studies addressed the solution efficiency of LPs with CVaR
objectives or constraints for cases when the number of scenarios is large. Lim, Sherali, and Uryasev
(2010) noted that (1) in this case may be viewed as a nondifferentiable optimization problem and
implemented a two-phase solution approach to solve large-scale instances. In the first phase, they exploit
descent-based optimization techniques to circumvent nondifferentiable points by perturbing the solution
to differentiable solutions within their “relative neighborhood”. The second phase employs a deflecting
subgradient search direction with a step size established by an adequate target value. They further
extended this approach with a third phase that resorts to the simplex algorithm after achieving convergence
by employing an advanced crash-basis dependent on solutions obtained from the first two phases.

Künzi-Bay and Mayer (2006) developed a solution technique for problem (1), with measure $\rho$ chosen
as the CVaR, that utilized a specialized L-shaped method after reformulating it as a two-stage stochastic
programming problem. However, Subramanian and Huang (2008) noted that the problem structure does
not naturally conform to the characteristics of a two-stage stochastic program and introduced a polyhedral
reformulation of the CVaR constraint with a statistics-based CVaR estimator to solve a closely related
version of the problem. In a follow-up study (Subramanian and Huang, 2009), they retained Value-at-Risk
(VaR) and CVaR as unknown variables in the CVaR constraints, enabling a more efficient decomposition
algorithm, as opposed to Klein Haneveld and van der Vlerk (2006), where the problem was solved as a
canonical integrated chance constraint problem with preceding estimates of VaR. Espinoza and Moreno
(2012) proposed a solution method for problems (1) with CVaR measures that entailed generation of
aggregated scenario constraints to form smaller relaxation problems whose optimal outcomes were then
used to directly evaluate the respective upper bound on the objective of the original problem.

In what follows, we develop a general scenario decomposition solution framework for solving
stochastic optimization problems with certainty equivalent-based risk measures by utilizing principles related
to those in Espinoza and Moreno (2012). The rest of the paper is organized as follows: A class of
certainty equivalent-based risk measures that are in the focus of this study and their implementation in
mathematical programming problems are discussed in Section 2. In Section 3 we propose the scenario
decomposition algorithm for stochastic programming problems with structure that is induced by the risk
measures described in Section 2. Lastly, experimental studies on portfolio optimization problems with
large-scale data sets that demonstrate the effectiveness of the developed technique are presented in
Section 4.

2  A  Class  of  Downside  Risk  Measures  Based  on  Certainty  Equivalents 

In this section we describe a class of risk measures that encompasses some popular instances in risk
management literature. A general solution algorithm that utilizes special properties of this class of
measures will be presented in the sequel. Specifically, this algorithm applies to the so-called coherent and
convex measures of risk that can be represented as an infimal convolution of certainty equivalent of some
utility function. Below we recall the definitions of coherent and convex risk measures and describe the
representation that motivated the present development.

In general, a risk measure $\rho(X)$ over a random outcome (specifically, a cost or a loss) $X$ from probability
space $(\Omega, \mathcal{F}, \mathsf{P})$ is defined as a lower semi-continuous (l.s.c.) mapping $\rho : \mathcal{X} \mapsto \mathbb{R}$, with $\mathcal{X}$ being the
space of bounded $\mathcal{F}$-measurable functions $X : \Omega \mapsto \mathbb{R}$. In order to avoid an excessively technical
discussion, we will implicitly assume that $\mathcal{X}$ is endowed with the properties necessary in the given context
(e.g., integrability, and so on). Additional properties of $\rho$ are introduced to make the corresponding risk
measure well-suited for a specific application area.

Artzner et al. (1999) and Delbaen (2002) proposed the following four axioms as the desirable
characteristics that a “good”, or coherent measure of risk should possess:

(A1)  monotonicity: $\rho(X) \le \rho(Y)$ for all $X, Y \in \mathcal{X}$ such that $X \le Y$;

(A2)  convexity: $\rho(\lambda X + (1 - \lambda) Y) \le \lambda \rho(X) + (1 - \lambda) \rho(Y)$ for all $X, Y \in \mathcal{X}$ and $0 \le \lambda \le 1$;

(A3)  positive homogeneity: $\rho(\lambda X) = \lambda \rho(X)$ for all $X \in \mathcal{X}$ and $\lambda > 0$;

(A4)  translation invariance: $\rho(X + a) = \rho(X) + a$ for all $X \in \mathcal{X}$ and $a \in \mathbb{R}$.

The following interpretations may be given to the above axioms: Axiom (A1) ensures that smaller losses
lead to lower risk. From the risk management point of view, the convexity axiom (A2) promotes risk
reduction via diversification; it is also of fundamental importance in the optimization context. The positive
homogeneity property (A3) postulates that scaling losses by a positive factor scales risk correspondingly.
Axiom (A4) allows for eliminating risk of an uncertain cost/loss profile $X$ by adding a deterministic
hedge, $\rho(X - \rho(X)) = 0$.
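The axioms can be checked numerically for a concrete coherent measure. The sketch below does so for the empirical CVaR on equiprobable scenarios; it is a spot check on one random sample rather than a proof, and the helper name `cvar`, the sample size, and the parameter values are assumptions made for this illustration.

```python
import numpy as np


def cvar(losses, alpha):
    """Empirical CVaR as min over eta of eta + mean([loss - eta]_+) / (1 - alpha).

    The objective is piecewise linear and convex in eta, so the minimum is
    attained at one of the scenario losses.
    """
    losses = np.asarray(losses, dtype=float)
    return min(
        eta + np.maximum(losses - eta, 0.0).mean() / (1.0 - alpha)
        for eta in losses
    )


rng = np.random.default_rng(1)
X = rng.exponential(size=200)
Y = rng.exponential(size=200)
alpha, lam, a, tol = 0.9, 0.3, 5.0, 1e-9

# (A1) monotonicity: X <= X + |Y| holds scenario by scenario.
monotone = cvar(X, alpha) <= cvar(X + np.abs(Y), alpha) + tol
# (A2) convexity along one convex combination.
convex = (cvar(lam * X + (1 - lam) * Y, alpha)
          <= lam * cvar(X, alpha) + (1 - lam) * cvar(Y, alpha) + tol)
# (A3) positive homogeneity and (A4) translation invariance.
homogeneous = np.isclose(cvar(2.5 * X, alpha), 2.5 * cvar(X, alpha))
translation = np.isclose(cvar(X + a, alpha), cvar(X, alpha) + a)
# Deterministic hedge: rho(X - rho(X)) = 0.
hedged = np.isclose(cvar(X - cvar(X, alpha), alpha), 0.0)

print(monotone, convex, homogeneous, translation, hedged)
```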

Since being proposed in Artzner et al. (1999) and Delbaen (2002), the axiomatic approach to defining risk
measures has been widely adopted in literature, and a number of risk functionals tailored to particular
preferences emerged thereafter (see, e.g., Krokhmal et al., 2011; Uryasev and Rockafellar, 2013). In
particular, it has been argued that the positive homogeneity property (A3) may be omitted in many
situations; the corresponding risk measures that satisfy axioms (A1), (A2), and (A4) are called convex
measures of risk (Ruszczyński and Shapiro, 2006).

Our interest in these two classes of risk measures stems from the following infimal convolution
representation that facilitates their use in mathematical programming problems.

Theorem 1 (Krokhmal, 2007; Vinel and Krokhmal, 2014a)  Function $\rho(X)$ is a proper coherent
(resp., convex) measure of risk if and only if it can be represented by the following infimal
convolution of a l.s.c. function $\phi : \mathcal{X} \mapsto \mathbb{R}$ such that $\phi(0) = 0$, $\phi(\eta) > \eta$ for all real $\eta \neq 0$, and which satisfies
(A1)-(A3) (resp., (A1)-(A2)):

$$\rho(X) = \inf_{\eta}\ \eta + \phi(X - \eta). \tag{2}$$

Moreover, the infimum in (2) is attained for all $X$, so $\inf_{\eta}$ may be replaced by $\min_{\eta \in \mathbb{R}}$.

Representation (2) can be used for construction of coherent (convex) risk measures through an appropriate
choice of function $\phi$. The present work concerns risk measures of type (2) that can directly
incorporate decision maker’s risk preferences as given by the utility theory of von Neumann and Morgenstern
(1944). This is desirable in view of the well-known fact (see, e.g., Schied and Föllmer, 2002) that risk
preferences expressed by coherent/convex measures of risk are generally not compatible with rational
risk-averse preferences (i.e., those defined by a non-decreasing concave utility function $u$).

Given that we operate with stochastic cost/loss variables, let $v(t) = -u(-t)$ be the utility function
adapted to loss variable $X$, or deutility function that quantifies dissatisfaction with cost or loss $X$. Then,
$\mathrm{CE}(X) = v^{-1}(\mathsf{E} v(X))$ represents the certainty equivalent (CE) of loss $X$, i.e., such a deterministic
loss that a rational decision maker with deutility function $v$ would be indifferent between $\mathrm{CE}(X)$ and
stochastic loss profile $X$. The following argument can be used to construct risk measures of the form (2)
that employ rational utility maximizer’s preferences via certainty equivalents (Vinel and Krokhmal, 2015,
see also Ben-Tal and Teboulle, 2007). Consider a decision maker who faces an uncertain future loss $X$,
but who can allocate an amount $\eta$ of resources now to cover the future loss. It will cost $v^{-1}\mathsf{E} v(X - \eta)_+$
to cover the remaining losses $(X - \eta)_+$, where $t_+ = \max\{0, t\}$ and an operator-like notation is used for
$v$, i.e., $v^{-1}\mathsf{E} v(X - \eta)_+ = v^{-1}(\mathsf{E} v((X - \eta)_+))$. The total cost can then be optimized with an appropriate
choice of $\eta$, such that the risk $\rho(X)$ of a future loss $X$ reduces to

$$\rho(X) = \min_{\eta}\ \eta + \frac{1}{1 - \alpha}\, v^{-1}\mathsf{E} v(X - \eta)_+, \qquad \alpha \in (0, 1), \tag{3}$$

where $(1 - \alpha)^{-1} > 1$ is a penalty factor (a detailed discussion of representation (3) and related aspects
is presented in Vinel and Krokhmal, 2014a).

Notably, expressing $\phi(X)$ in (2) via certainty equivalents necessarily requires that $(\cdot)_+$ appears in (3) in
order for $\phi(X)$ to conform to the conditions of Theorem 1 (Vinel and Krokhmal, 2014a). The conditions
on $v$ that guarantee convexity of $\mathrm{CE}(X) = v^{-1}\mathrm{E}v(X)$, and, correspondingly, of $\phi(X)$, can be found,
for example, in Ben-Tal and Teboulle (2007): $v$ should be three times continuously differentiable, and
$v'(t)/v''(t)$ be convex. In what follows, we implicitly assume that $\phi(X) = (1 - \alpha)^{-1} v^{-1}\mathrm{E}v(X_+)$ is
convex and satisfies the conditions of Theorem 1:

(U1) Function $v(t)$ is continuously differentiable, increasing, convex, and, moreover, such that $v(0) = 0$
and the certainty equivalent $v^{-1}\mathrm{E}v(X)$ is convex in $X$.

A key property of risk measures (3) is isotonicity with respect to second order stochastic dominance
(SSD), provided that deutility function $v$ is convex and nondecreasing:

(A5) SSD isotonicity: $\rho(X) \le \rho(Y)$ for all $X, Y \in \mathcal{X}$ such that $(-X) \succeq_{SSD} (-Y)$.

Recall that payoff profile $Y_1$ dominates $Y_2$ with respect to SSD, $Y_1 \succeq_{SSD} Y_2$, if and only if
$\mathrm{E}u(Y_1) \ge \mathrm{E}u(Y_2)$ holds for all non-decreasing concave utility functions $u$, or, in other words, if every
rational risk-averse decision maker prefers $Y_1$ over $Y_2$. In this regard, (A5) implies that risk measures (3)
“inherit” the risk preferences given by the utility $u$ (equivalently, $v$). It is important to note that coherent
and convex measures of risk are generally not SSD-isotonic (Krokhmal et al., 2011).

Another common property of risk measures (3) is that they are “tail” risk measures in the sense that the
tail $\{X : X > \eta^*(X)\}$ of the loss distribution is used to quantify risk, where the location of the “tail
cutoff” point $\eta^*(X)$, which is a minimizer in (3), can be adjusted according to risk preferences via the
parameter $\alpha$ (see Krokhmal, 2007; Vinel and Krokhmal, 2014a).

Several practical and interesting risk measure families can be obtained from (3) by selecting a specific
deutility function $v$. If $v(t) = t$, then (3) defines the well-known Conditional Value-at-Risk measure
(Rockafellar and Uryasev, 2000, 2002):

$$\mathrm{CVaR}_{\alpha}(X) = \min_{\eta}\; \eta + (1 - \alpha)^{-1}\mathrm{E}(X - \eta)_+, \quad \alpha \in (0, 1). \tag{4}$$

If $v(t) = t^p$ for $t \ge 0$ and $p \ge 1$, then representation (3) yields a two-parametric family of higher-
moment coherent risk measures (HMCR) (Krokhmal, 2007):

$$\mathrm{HMCR}_{p,\alpha}(X) = \min_{\eta}\; \eta + (1 - \alpha)^{-1}\|(X - \eta)_+\|_p, \quad \alpha \in (0, 1),\ p \ge 1, \tag{5}$$

where $\|X\|_p = (\mathrm{E}|X|^p)^{1/p}$. If $v(t) = \lambda^t - 1$, $\lambda > 1$, then one obtains the family of log-exponential
convex measures of risk (Vinel and Krokhmal, 2015):

$$\mathrm{LogExpCR}_{\lambda,\alpha}(X) = \min_{\eta}\; \eta + (1 - \alpha)^{-1}\log_{\lambda}\mathrm{E}\lambda^{(X - \eta)_+}, \quad \alpha \in (0, 1),\ \lambda > 1. \tag{6}$$

Unlike the CVaR and HMCR measures that are coherent, the LogExpCR measure is convex but not
coherent as it does not satisfy the positive homogeneity axiom (A3).
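As an illustration of representation (3) on a finite scenario set, the following minimal sketch evaluates the risk measure by a one-dimensional search over $\eta$. It is not the method used in this work: the function name `ce_risk`, its arguments, and the choice of a ternary search (which relies only on the convexity of the objective in $\eta$) are assumptions made for this example.

```python
import math


def ce_risk(losses, probs, alpha, v, v_inv, iters=200):
    """Evaluate min over eta of eta + v_inv(E v((X - eta)_+)) / (1 - alpha)."""

    def objective(eta):
        s = sum(p * v(max(x - eta, 0.0)) for x, p in zip(losses, probs))
        return eta + v_inv(s) / (1.0 - alpha)

    hi = max(losses)                      # the objective equals eta for eta >= max loss
    width = max(hi - min(losses), 1.0)
    lo = min(losses) - width
    # Widen the bracket to the left until the convex objective is larger at lo
    # than at the smallest loss, which places a minimizer to the right of lo.
    while objective(lo) <= objective(min(losses)):
        width *= 2.0
        lo -= width
    for _ in range(iters):                # ternary search on a convex function
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) <= objective(b):
            hi = b
        else:
            lo = a
    return objective(0.5 * (lo + hi))


# Hypothetical data: four equiprobable loss scenarios.
losses = [1.0, 2.0, 3.0, 4.0]
probs = [0.25] * 4
cvar = ce_risk(losses, probs, 0.5, v=lambda t: t, v_inv=lambda s: s)              # (4)
hmcr2 = ce_risk(losses, probs, 0.5, v=lambda t: t * t, v_inv=math.sqrt)           # (5), p = 2
logexp = ce_risk(losses, probs, 0.5, v=lambda t: math.exp(t) - 1.0, v_inv=math.log1p)  # (6), lambda = e
```

Passing the deutility $v$ and its inverse as callables keeps the three families (4)-(6) behind a single routine; for this data the CVaR value is the average of the two largest losses.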

Perhaps one of the most widely used coherent measures of risk is defined by (4), which represents,
roughly speaking, the conditional expectation of losses that may occur in the $(1 - \alpha) \cdot 100\%$ of worst
realizations of $X$. Clearly, CVaR measure is a special case of (5) when $p = 1$, $\mathrm{HMCR}_{1,\alpha}(X) = \mathrm{CVaR}_{\alpha}(X)$.
When $p > 1$, HMCR measures quantify risk via higher tail moments $\|(X - \eta)_+\|_p$, and have been
shown to be better suited for applications that involve heavy-tailed loss distributions (Krokhmal, 2007).
Likewise, the LogExpCR family (6) is designed for dealing with heavy-tailed distributions; moreover, in
addition to being SSD-isotonic, LogExpCR measures are isotonic with respect to stochastic dominance
of arbitrary order (kSD), see Vinel and Krokhmal (2015).

Next  we  discuss  the  implementation  of  the  risk  measures  discussed  above  in  mathematical  programming 
problems. 

2.1  Implementation  in  Stochastic  Programming 

Assume that loss $X$ is a function of the decision variable $x$, $X = X(x, \omega)$, where $\omega \in \Omega$. Then, for
a compact and convex feasible set $C \subset \mathbb{R}^n$, consider a stochastic programming problem with a risk
constraint in the form

$$\min \{ g(x) : \rho(X(x, \omega)) \le h(x),\ x \in C \}. \tag{7}$$

Theorem 2 Consider problem (7) where set $C \subset \mathbb{R}^n$ is compact and convex, and functions $g(x)$ and
$h(x)$ are convex and concave on $C$, respectively. If further, the cost or loss function $X(x, \omega)$ is convex in
$x$, and $\rho$ is a coherent or convex measure of risk with representation (2), then problem (7) is equivalent
to

$$\min \{ g(x) : \eta + \phi(X(x, \omega) - \eta) \le h(x),\ (x, \eta) \in C \times \mathbb{R} \}, \tag{8}$$

in the sense that (7) and (8) achieve minima at the same values of the decision variable $x$ and their
optimal objective values coincide. Further, if the risk constraint in (7) is binding at optimality, $(x^*, \eta^*)$
achieves the minimum of (8) if and only if $x^*$ is an optimal solution of (7) and

$$\eta^* \in \arg\min_{\eta}\; \eta + \phi(X(x^*, \omega) - \eta).$$

Proof: See Krokhmal (2007). □


Remark 1 Note that the risk minimization problem

$$\min \{ \rho(X(x, \omega)) : x \in C \} \tag{9}$$

is obtained from (7) by introduction of a dummy variable $x_{n+1}$ and letting $g(x) = h(x) = x_{n+1}$.

Let function $\phi$ in (8) have the form $\phi(X) = (1 - \alpha)^{-1} v^{-1}\mathrm{E}v(X_+)$. Given a discrete set of scenarios
$\{\omega_1, \ldots, \omega_N\} = \Omega$ that induce cost or loss outcomes $X(x, \omega_1), \ldots, X(x, \omega_N)$ for any given decision
vector $x$, it is easy to see that the risk constraint in (8) can be represented by the following set of
inequalities:

$$\begin{array}{ll}
\eta + (1 - \alpha)^{-1} w_0 \le h(x), & (10a) \\
w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (10b) \\
w_j \ge X(x, \omega_j) - \eta, \quad j \in \mathcal{N}, & (10c) \\
w_j \ge 0, \quad j \in \mathcal{N}, & (10d)
\end{array}$$

where $\mathcal{N}$ denotes the set of scenario indices, $\mathcal{N} = \{1, \ldots, N\}$, and $\pi_j = P(\omega_j) > 0$ represent the
corresponding scenario probabilities that satisfy $\pi_1 + \cdots + \pi_N = 1$.

In the above discussion it was shown that several types of risk measures emerge from different choices
of the deutility function $v$. Here we note that the corresponding representations of constraint (10b) in
the context of HMCR and LogExpCR measures lead to sufficiently “nice”, i.e., convex, mathematical
programming models. For HMCR measures inequality (10b) becomes

$$w_0 \ge \Big(\sum_{j \in \mathcal{N}} \pi_j w_j^p\Big)^{1/p}, \tag{11}$$

which is equivalent to a standard $p$-order cone under affine scaling. Noteworthy instances of (11) for
which readily available mathematical programming solution methods exist include $p = 1, 2$. In the
particular case of $p = 1$, which corresponds to CVaR, the problem reduces to a linear programming
(LP) model. For instances when $p = 2$, one obtains a second-order cone programming (SOCP) model
that is efficiently solvable using long-step self-dual interior point methods. However, no similarly
efficient solution methods exist for solving $p$-order conic constrained problems when $p \in (1, 2) \cup (2, \infty)$
due to the fact that the $p$-cone is not self-dual in this case. Additional discussion and computational
considerations for such instances are given in Section 4.1. Lastly, the following exponential inequality
corresponds to constraint (10b) when $\rho$ is a LogExpCR measure:

$$w_0 \ge \ln \sum_{j \in \mathcal{N}} \pi_j e^{w_j}, \tag{12}$$

which is also convex and allows for the resulting optimization problem to be solved using appropriate
(e.g., interior point) methods.
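The remark that the HMCR form of constraint (10b) is a $p$-order cone “under affine scaling” can be made explicit with a short worked step. The rescaled variables $\tilde{w}_j$ below are introduced only for this illustration and follow directly from $v(t) = t^p$ in (10b):

```latex
% Substituting v(t) = t^p into (10b) and rescaling each w_j by its probability weight:
\tilde{w}_j = \pi_j^{1/p} w_j, \quad j \in \mathcal{N}
\quad\Longrightarrow\quad
w_0 \ge \Big(\sum_{j \in \mathcal{N}} \pi_j w_j^p\Big)^{1/p}
    = \Big(\sum_{j \in \mathcal{N}} \tilde{w}_j^{\,p}\Big)^{1/p}
    = \|(\tilde{w}_1, \ldots, \tilde{w}_N)\|_p ,
% while the scenario constraints (10c) become
\pi_j^{-1/p}\, \tilde{w}_j \ge X(x, \omega_j) - \eta, \quad j \in \mathcal{N}.
```

In the rescaled variables the probabilities $\pi_j$ move from the conic constraint into the linear scenario constraints, so the cone itself takes the standard form $w_0 \ge \|\tilde{w}\|_p$.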


3  Scenario  Decomposition  Algorithm 

Large-scale stochastic optimization models with CVaR measure (4) and the corresponding solution
algorithms have received considerable attention in the literature. In this section we propose an efficient
scenario decomposition algorithm for solving large-scale mathematical programming problems that use
certainty equivalent-based risk measures (3), which contain CVaR as a special case.

The algorithm relies on solving a series of relaxation problems containing linear combinations of
scenario-based constraints that are systematically decomposed until an optimal solution of the original
problem is found or the problem is proven to be infeasible. Naturally, the core assumption behind
such a scheme is that sequential solutions of smaller relaxation problems can be achieved within shorter
computation times. By virtue of Section 2, when the distribution of loss function $X(x, \omega)$ has a finite
support (scenario set) $\Omega = \{\omega_1, \ldots, \omega_N\}$ with probabilities $P(\omega_j) = \pi_j > 0$, the stochastic programming
problem with risk constraint (8) admits the form


$$\begin{array}{lll}
\min & g(x) & (13a) \\
\text{s.t.} & x \in C, & (13b) \\
 & \eta + (1 - \alpha)^{-1} w_0 \le h(x), & (13c) \\
 & w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (13d) \\
 & w_j \ge X(x, \omega_j) - \eta, \quad j \in \mathcal{N}, & (13e) \\
 & w_j \ge 0, \quad j \in \mathcal{N}, & (13f)
\end{array}$$

where $\mathcal{N} = \{1, \ldots, N\}$. If we assume that function $g(x)$ and feasible set $C$ are “nice” in the sense
that problem $\min\{g(x) : x \in C\}$ admits efficient solution methods, then formulation (13) may present
challenges that are two-fold. First, constraint (13d) may need a specialized solution approach, especially
in the case of large $N$. Second, when $N$ is large, computational difficulties may be associated with
handling the large number of constraints (13e)-(13f). In this work we present an iterative procedure for
dealing with a large number of scenario-based inequalities (13e)-(13f).

Since the original problem (13) with many constraints of the form (13e)-(13f) may be hard to solve, a
relaxation of (13) can be constructed by aggregating some of the scenario constraints. Let $\{\mathcal{S}_k : k \in \mathcal{K}\}$
denote a partition of the set $\mathcal{N}$ of scenario indices (which we will simply call scenario set), i.e.,

$$\bigcup_{k \in \mathcal{K}} \mathcal{S}_k = \mathcal{N}, \qquad \mathcal{S}_i \cap \mathcal{S}_j = \emptyset \ \text{ for all } i, j \in \mathcal{K},\ i \ne j.$$

The aggregation of scenario constraints by adding inequalities (13e) within sets $\mathcal{S}_k$ produces the following
master problem:


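$$\begin{array}{lll}
\min & g(x) & (14a) \\
\text{s.t.} & x \in C, & (14b) \\
 & \eta + (1 - \alpha)^{-1} w_0 \le h(x), & (14c) \\
 & w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (14d) \\
 & \sum_{j \in \mathcal{S}_k} w_j \ge \sum_{j \in \mathcal{S}_k} X(x, \omega_j) - |\mathcal{S}_k|\,\eta, \quad k \in \mathcal{K}, & (14e) \\
 & w_j \ge 0, \quad j \in \mathcal{N}. & (14f)
\end{array}$$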
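The effect of aggregating the scenario constraints can be checked numerically. The sketch below, with hypothetical helper names and data chosen for this example, tests the scenario constraints (13e) and the aggregated constraints (14e) for a fixed decision, i.e., for fixed scenario losses $X(x, \omega_j)$, and shows a point that satisfies the aggregated form but violates the individual constraints.

```python
def scenario_constraints_hold(w, losses, eta, tol=1e-12):
    """Constraints (13e): w_j >= X_j - eta for every scenario j."""
    return all(w[j] >= losses[j] - eta - tol for j in range(len(losses)))


def aggregated_constraints_hold(w, losses, eta, partition, tol=1e-12):
    """Constraints (14e): sums of (13e) over each set S_k of the partition."""
    return all(
        sum(w[j] for j in s) >= sum(losses[j] for j in s) - len(s) * eta - tol
        for s in partition
    )


# Hypothetical data: three scenario losses for a fixed decision x, one aggregated set.
losses = [1.0, 2.0, 3.0]
eta = 2.0
partition = [{0, 1, 2}]

w_feasible = [0.0, 0.0, 1.0]   # satisfies (13e), hence also (14e)
w_relaxed = [0.0, 0.0, 0.0]    # violates (13e) for scenario 2, yet satisfies (14e)
```

With a single aggregated set the slack of low-loss scenarios offsets the violation in the high-loss scenario, which is why the master problem is only a relaxation until the partition is refined.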

Clearly, any feasible solution of (13) is also feasible for (14), and the optimal value of (14) represents a
lower bound for that of (13). Since the relaxed problem contains fewer scenario-based constraints (14e),
it is potentially easier to solve. It would then be of interest to determine the conditions under which an
optimal solution of (14) is also optimal for the original problem (13). Assuming that $x^*$ is an optimal
solution of (14), consider the problem

$$\begin{array}{lll}
\min & \eta + (1 - \alpha)^{-1} w_0 & (15a) \\
\text{s.t.} & w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (15b) \\
 & w_j \ge X(x^*, \omega_j) - \eta, \quad j \in \mathcal{N}, & (15c) \\
 & w_j \ge 0, \quad j \in \mathcal{N}. & (15d)
\end{array}$$

Proposition 1 Consider problem (13) and its relaxation (14) obtained by aggregating scenario
constraints (13e) over sets $\mathcal{S}_k$, $k \in \mathcal{K}$, that form a partition of $\mathcal{N} = \{1, \ldots, N\}$. Assuming that (13) is
feasible, consider problem (15) where $x^*$ is an optimal solution of relaxation (14). Let $(\eta^{**}, w^{**})$ be an
optimal solution of (15). If the optimal value of (15) satisfies condition

$$\eta^{**} + (1 - \alpha)^{-1} w_0^{**} \le h(x^*), \tag{16}$$

then $(x^*, \eta^{**}, w^{**})$ is an optimal solution of the original problem (13).

Proof: Let $x^{\circ}$ be an optimal solution of (13). Obviously, one has $g(x^*) \le g(x^{\circ})$. The statement
of the proposition then follows immediately by observing that inequality (16) guarantees the triple
$(x^*, \eta^{**}, w^{**})$ to be feasible for problem (13). □

The statement of Proposition 1 allows one to solve the original problem (13) by constructing an
appropriate partition of $\mathcal{N}$ and solving the corresponding master problem (14). Below we outline an iterative
procedure that accomplishes this goal.

Step 0: The algorithm is initialized by including all scenarios in a single partition, $\mathcal{K} = \{0\}$,
$\mathcal{S}_0 = \{1, \ldots, N\}$.

Step 1: For a current partition $\{\mathcal{S}_k : k \in \mathcal{K}\}$, solve the master problem (14). If (14) is infeasible, then the
original problem (13) is infeasible as well, and the algorithm terminates. Otherwise, let $x^*$ be an optimal
solution of the master (14).

Step 2: Given a solution $x^*$ of the master, solve problem (15), and let $(\eta^{**}, w^{**})$ denote the corresponding
optimal solution. If condition (16) is satisfied, the algorithm terminates with $(x^*, \eta^{**}, w^{**})$ being an
optimal solution of (13) due to Proposition 1. If, however, condition (16) is violated,

$$\eta^{**} + (1 - \alpha)^{-1} w_0^{**} > h(x^*),$$

then the algorithm proceeds to Step 3 to update the partition.

Step 3: Determine the set of scenario-based constraints in (15) that, for a given solution of the master $x^*$,
are binding at optimality:

$$\mathcal{J} = \{ j \in \mathcal{N} : w_j^{**} = X(x^*, \omega_j) - \eta^{**} > 0 \}. \tag{17}$$

Then, the elements of $\mathcal{J}$ are removed from the existing sets $\mathcal{S}_k$:

$$\mathcal{S}_k = \mathcal{S}_k \setminus \mathcal{J}, \quad k \in \mathcal{K},$$

and added to the partition as single-element sets:

$$\{\mathcal{S}_0, \ldots, \mathcal{S}_K\} \cup \{\mathcal{S}_{K+1}, \ldots, \mathcal{S}_{K+|\mathcal{J}|}\}, \quad \text{where } \mathcal{S}_{K+i} = \{j_i\} \text{ for each } j_i \in \mathcal{J},\ i = 1, \ldots, |\mathcal{J}|,$$

and the algorithm proceeds to Step 1.
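Steps 0-3 can be sketched as a short driver loop. This is only an outline under stated assumptions: `solve_master` and `solve_subproblem` are hypothetical user-supplied callables standing in for solvers of (14) and (15), `h` is the right-hand side function of the risk constraint, and dropping sets that become empty after the update is a convenience of this sketch rather than part of the procedure above.

```python
def refine_partition(partition, binding):
    """Step 3: move every scenario in `binding` into its own single-element set."""
    binding = set(binding)
    refined = [s - binding for s in partition]
    refined = [s for s in refined if s]            # drop sets emptied by the removal
    refined.extend({j} for j in sorted(binding))
    return refined


def scenario_decomposition(n_scenarios, solve_master, solve_subproblem, h):
    """Outline of Steps 0-3.

    solve_master(partition)  -> x, or None if the master problem (14) is infeasible
    solve_subproblem(x)      -> (eta, risk_value, binding), where risk_value is the
                                optimal value of (15) and binding is the set J in (17)
    """
    partition = [set(range(n_scenarios))]          # Step 0: a single aggregated set
    for _ in range(n_scenarios + 1):
        x = solve_master(partition)                # Step 1
        if x is None:
            return None                            # (13) is infeasible as well
        eta, risk_value, binding = solve_subproblem(x)   # Step 2
        if risk_value <= h(x):                     # stopping condition (16)
            return x, eta
        partition = refine_partition(partition, binding)  # Step 3
    raise RuntimeError("no termination within the expected number of iterations")
```

For example, `refine_partition([{0, 1, 2, 3}], {1, 3})` yields the sets `{0, 2}`, `{1}`, `{3}`: the aggregated set shrinks and each binding scenario receives its own constraint in the next master problem.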

Theorem 3 Assume that in problem (13) functions $g(x)$ and $X(x, \omega)$ are convex in $x$, $h(x)$ is concave
in $x$, $v$ satisfies assumption (U1), and the set $C$ is convex and compact. Then, the described scenario
decomposition algorithm either finds an optimal solution of problem (13) or declares its infeasibility
after at most $N$ iterations.

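Proof: Let us show that during an iteration of the algorithm the size of the partition of the set $\mathcal{N}$ of
scenarios increases by at least one.

Let $\{\mathcal{S}_k : k \in \mathcal{K}\}$ be the current partition of $\mathcal{N}$, $(x^*, \eta^*, w^*)$ be the corresponding optimal solution of
(14), and $(\eta^{**}, w^{**})$ be an optimal solution of (15) for the given $x^*$, such that the stopping condition
(16) is not satisfied,

$$\eta^{**} + (1 - \alpha)^{-1} w_0^{**} > h(x^*). \tag{18}$$

Let $\mathcal{S}^*$ denote the set of constraints (15c) that are binding at optimality,

$$\mathcal{S}^* = \{ j : w_j^{**} = X(x^*, \omega_j) - \eta^{**} > 0,\ j \in \mathcal{N} \}.$$

Next, consider a problem obtained from (15) with a given $x^*$ by aggregating the constraints (15c) that
are non-binding at optimality:

$$\begin{array}{lll}
\min & \eta + (1 - \alpha)^{-1} w_0 & (19a) \\
\text{s.t.} & w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (19b) \\
 & w_j \ge X(x^*, \omega_j) - \eta, \quad j \in \mathcal{S}^*, & (19c) \\
 & \sum_{j \in \bar{\mathcal{S}}^*} w_j \ge \sum_{j \in \bar{\mathcal{S}}^*} X(x^*, \omega_j) - |\bar{\mathcal{S}}^*|\,\eta, & (19d) \\
 & w_j \ge 0, \quad j \in \mathcal{N}, & (19e)
\end{array}$$

where $\bar{\mathcal{S}}^* = \mathcal{N} \setminus \mathcal{S}^*$. Obviously, an optimal solution $(\eta^{**}, w^{**})$ of (15) will also be optimal for (19).

Next, observe that at any stage of the algorithm, the partition $\{\mathcal{S}_k : k \in \mathcal{K}\}$ is such that there exists at
most one set with $|\mathcal{S}_k| > 1$, namely set $\mathcal{S}_0$, and the rest of the sets in the partition satisfy $|\mathcal{S}_k| = 1$,
$k \ne 0$. Let us denote

$$\bar{\mathcal{S}}_0 = \mathcal{N} \setminus \mathcal{S}_0 = \bigcup_{k \in \mathcal{K} \setminus \{0\}} \mathcal{S}_k.$$

Assume that $\mathcal{S}^* \subseteq \bar{\mathcal{S}}_0$. By rewriting the master problem (14) as

$$\begin{array}{lll}
\min & g(x) & (20a) \\
\text{s.t.} & x \in C, & (20b) \\
 & \eta + (1 - \alpha)^{-1} w_0 \le h(x), & (20c) \\
 & w_0 \ge v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v(w_j)\Big), & (20d) \\
 & w_j \ge X(x, \omega_j) - \eta, \quad j \in \bar{\mathcal{S}}_0, & (20e) \\
 & \sum_{j \in \mathcal{S}_0} w_j \ge \sum_{j \in \mathcal{S}_0} X(x, \omega_j) - |\mathcal{S}_0|\,\eta, & (20f) \\
 & w_j \ge 0, \quad j \in \mathcal{N}, & (20g)
\end{array}$$

we observe that the components $\eta^*$, $w^*$ of its optimal solution are feasible for (19). Indeed, from (20e)
one has that

$$w_j^* \ge X(x^*, \omega_j) - \eta^*, \quad j \in \mathcal{S}^*,$$

which satisfies (19c), and also

$$w_j^* \ge X(x^*, \omega_j) - \eta^*, \quad j \in \bar{\mathcal{S}}_0 \setminus \mathcal{S}^* = \bar{\mathcal{S}}^* \setminus \mathcal{S}_0.$$

Adding the last inequalities yields

$$\sum_{j \in \bar{\mathcal{S}}^* \setminus \mathcal{S}_0} w_j^* \ge \sum_{j \in \bar{\mathcal{S}}^* \setminus \mathcal{S}_0} X(x^*, \omega_j) - |\bar{\mathcal{S}}^* \setminus \mathcal{S}_0|\,\eta^*,$$

which can then be aggregated with (20f) to produce

$$\sum_{j \in \bar{\mathcal{S}}^*} w_j^* \ge \sum_{j \in \bar{\mathcal{S}}^*} X(x^*, \omega_j) - |\bar{\mathcal{S}}^*|\,\eta^*,$$

verifying the feasibility of $(\eta^*, w^*)$ for (19). Since (20c) has to hold for $(x^*, \eta^*, w^*)$, we obtain that

$$\eta^{**} + (1 - \alpha)^{-1} w_0^{**} \le \eta^* + (1 - \alpha)^{-1} w_0^* \le h(x^*),$$

which furnishes a contradiction with (18). Therefore, one has to have $\mathcal{S}^* \not\subseteq \bar{\mathcal{S}}_0$, i.e., $\mathcal{S}_0 \cap \mathcal{S}^* \ne \emptyset$, for (18)
to hold, meaning that at least one additional scenario from $\mathcal{S}^*$ will be added to the partition during Step 3
of the algorithm.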
It is easy to see that the number of iterations cannot exceed the number $N$ of scenarios. □

Remark 2 The fact that the proposed scenario decomposition method terminates within at most $N$
iterations represents an important advantage over several existing cutting-plane methods that were
developed in the literature for problems involving Conditional Value-at-Risk measure (Künzi-Bay and Mayer,
2006), integrated chance constraints (Klein Haneveld and van der Vlerk, 2006), and SSD constraints
(Roman et al., 2006). In the mentioned works, the cutting-plane algorithms utilized supporting
hyperplane representations for scenario constraints, which were themselves exponential in the size $N$ of
scenario sets. Although finite convergence of the cutting plane techniques was guaranteed by the
polyhedral structure of the scenario constraints (in the case when $X(x, \omega)$ is linear in $x$), no estimate for the
sufficient number of iterations was provided. A level-type regularization of cutting plane method for
problems with SSD constraints, which allows for an estimate of the number of cuts due to Lemaréchal
et al. (1995), is discussed in Fábián et al. (2011).

3.1  An  Efficient  Solution  Method  for  Sub-Problem  (15) 

Although formulation (15) may be solved using appropriate mathematical programming techniques, an
efficient alternative solution method can be employed by noting that (15) is equivalent to

$$\min_{\eta}\; \eta + \frac{1}{1 - \alpha}\, v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v\big((X(x^*, \omega_j) - \eta)_+\big)\Big), \tag{21}$$

which is a mathematical programming implementation of representation (3) under a finite scenario model
where realizations $X(x^*, \omega_j)$ represent scenario losses corresponding to an optimal decision $x^*$ in the
master problem (14). An optimal value of $\eta$ in (15) and (21) can be computed directly using its properties
dictated by representation (3).

Namely, let $X_j = X(x^*, \omega_j)$ represent the optimal loss in scenario $j$ for problem (14), and let $X_{(m)}$ be
the $m$-th smallest outcome among $X_1, \ldots, X_N$, such that

$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(N)}.$$

The following proposition enables evaluation of $\eta^{**}$ as a “cutoff” point within the tail of the loss
distribution.

Proposition 2 Given a function $v(\cdot)$ that satisfies (U1) and an $\alpha \in (0, 1)$, a sufficient condition for $\eta^{**}$
to be an optimal solution in problems (21) and (15) has the form

$$\frac{\sum_{j : X_j > \eta^{**}} \pi_j v'(X_j - \eta^{**})}{v'\Big(v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v\big((X_j - \eta^{**})_+\big)\Big)\Big)} + \alpha - 1 = 0, \tag{22}$$

where $v'$ denotes the derivative of $v$.

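Proof: The underlying assumption (U1) on $v$ entails that $\phi(X) = (1 - \alpha)^{-1} v^{-1}\mathrm{E}v(X_+)$ is convex,
whence the objective function of (21)

$$\Phi_X(\eta) = \eta + \phi(X - \eta) = \eta + \frac{1}{1 - \alpha}\, v^{-1}\Big(\sum_{j \in \mathcal{N}} \pi_j v\big((X_j - \eta)_+\big)\Big) \tag{23}$$

is convex on $\mathbb{R}$. Moreover, the condition $\phi(\eta) > \eta$ for $\eta \ne 0$ of Theorem 1 guarantees that the set of
minimizers of $\Phi_X(\eta)$ is compact and convex in $\mathbb{R}$. Indeed, it is easy to see that $\Phi_X(\eta) = \eta$ for $\eta \ge X_{(N)}$
and $\Phi_X(\eta) \sim -\frac{\alpha}{1 - \alpha}\,\eta$ for $\eta \ll -1$.

Now, consider the left derivative of $\Phi_X(\eta)$ at a given point $\eta = \eta^{**}$:

$$\begin{aligned}
-(1 - \alpha) + (1 - \alpha)\,\frac{d^-}{d\eta}\Phi_X(\eta)\Big|_{\eta = \eta^{**}}
&= \lim_{\epsilon \to 0+} \frac{1}{-\epsilon}\Big\{ v^{-1}\Big(\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta^{**} + \epsilon)\Big) - v^{-1}\Big(\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta^{**})\Big) \Big\} \\
&= \frac{d^-}{d\eta}\Big\{ v^{-1}\Big(\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta)\Big) \Big\}\Big|_{\eta = \eta^{**}}
 = \frac{d}{d\eta}\Big\{ v^{-1}\Big(\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta)\Big) \Big\}\Big|_{\eta = \eta^{**}},
\end{aligned}$$

where the last equality follows from the continuous differentiability of function
$v^{-1}\big(\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta)\big)$ at the point $\eta^{**}$ due to the assumed properties of $v$. Analogously, the
right derivative of $\Phi_X(\eta)$ at $\eta = \eta^{**}$ equals

$$\frac{d^+}{d\eta}\Phi_X(\eta)\Big|_{\eta = \eta^{**}} = 1 + \frac{1}{1 - \alpha}\,\frac{d}{d\eta}\Big\{ v^{-1}\Big(\sum_{j : X_j > \eta^{**}} \pi_j v(X_j - \eta)\Big) \Big\}\Big|_{\eta = \eta^{**}},$$

where the strict inequality in summation is due to the fact that $v(X_j - \eta^{**} - \epsilon)_+ = 0$ for all $\epsilon > 0$ if
$X_j \le \eta^{**}$.

Observe that $\Phi_X(\eta)$ may only be non-differentiable at points $\eta = X_j$. Indeed, for any $\eta^{**} \ne X_j$, $j \in \mathcal{N}$,
the obtained expressions for left and right derivatives become equivalent, and equation (22) is obtained
from the first order optimality conditions by computing the derivatives of the functions in braces and
noting that $\sum_{j : X_j \ge \eta^{**}} \pi_j v(X_j - \eta^{**}) = \sum_{j : X_j > \eta^{**}} \pi_j v(X_j - \eta^{**}) = \sum_{j \in \mathcal{N}} \pi_j v\big((X_j - \eta^{**})_+\big)$. □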


Recall that the scenario decomposition algorithm presented above uses subproblem (15) for determining
an optimal value of $\eta^{**}$, as well as for identifying (during Step 3) the set $\mathcal{J}$ of scenarios that are
binding at optimality, i.e., for which $X(x^*, \omega_j) - \eta^{**} > 0$. This can be accomplished with the help of
the derived optimality condition (22) as follows.

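Step (i) Compute values $X_j = X(x^*, \omega_j)$, where $x^*$ is an optimal solution of (14), and sort them in
ascending order: $X_{(1)} \le \ldots \le X_{(N)}$.

Step (ii) For $m = N, N - 1, \ldots, 1$, compute values $T_m$ as

$$T_N = 1 - \alpha, \qquad
T_m = 1 - \alpha - \frac{\sum_{j = m + 1}^{N} \pi_j v'\big(X_{(j)} - X_{(m)}\big)}{v'\Big(v^{-1}\Big(\sum_{j = m + 1}^{N} \pi_j v\big(X_{(j)} - X_{(m)}\big)\Big)\Big)}, \quad m = N - 1, N - 2, \ldots, 1, \tag{24}$$

where $\pi_j$ under the summation is understood as the probability of the scenario that yields $X_{(j)}$,
until $m^*$ is found such that

$$T_{m^*} \le 0, \qquad T_{m^* + 1} > 0. \tag{25}$$

Step (iii) If $T_{m^*} = 0$, then the solution $\eta^{**}$ of (15), (21) is equal to $X_{(m^*)}$. Otherwise, $\eta^{**}$ satisfies

$$\eta^{**} \in (X_{(m^*)}, X_{(m^* + 1)}],$$

and its value can be found by using an appropriate numerical procedure, such as Newton’s method. The
set $\mathcal{J}$ in (17) is then obtained as

$$\mathcal{J} = \{ j : X_j = X_{(k)},\ k = m^* + 1, \ldots, N \}.$$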
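Steps (i)-(iii) can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the function name and arguments are hypothetical, the loss values are assumed to be pairwise distinct, and a plain bisection on the sign of the slope is used in place of Newton’s method for the interior case.

```python
import math


def locate_cutoff(losses, probs, alpha, v, dv, v_inv, tol=1e-12):
    """Return (eta, J) following steps (i)-(iii); indices in J refer to `losses`.

    eta is None when T_m > 0 for every m, in which case J holds all scenarios.
    Assumes pairwise distinct loss values.
    """
    order = sorted(range(len(losses)), key=lambda j: losses[j])   # step (i)
    xs = [losses[j] for j in order]
    ps = [probs[j] for j in order]
    n = len(xs)

    def scaled_slope(eta, first):
        # (1 - alpha) times the slope of the objective (21) at eta, using only
        # the sorted scenarios first..n-1, all of which lie above eta.
        num = sum(ps[j] * dv(xs[j] - eta) for j in range(first, n))
        total = sum(ps[j] * v(xs[j] - eta) for j in range(first, n))
        return 1.0 - alpha - num / dv(v_inv(total))

    # Step (ii): T_N = 1 - alpha > 0, so the scan starts at m = N - 1.
    m_star, t_star = None, None
    for m in range(n - 2, -1, -1):
        t_m = scaled_slope(xs[m], m + 1)
        if t_m <= tol:
            m_star, t_star = m, t_m
            break
    if m_star is None:
        return None, sorted(order)

    tail = sorted(order[m_star + 1:])
    if abs(t_star) <= tol:                      # step (iii), case T_{m*} = 0
        return xs[m_star], tail
    lo, hi = xs[m_star], xs[m_star + 1]         # step (iii), eta in (X_(m*), X_(m*+1)]
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if scaled_slope(mid, m_star + 1) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi, tail


# Hypothetical data: CVaR (v(t) = t) with four equiprobable scenarios.
eta_cvar, j_cvar = locate_cutoff([1.0, 2.0, 3.0, 4.0], [0.25] * 4, 0.5,
                                 v=lambda t: t, dv=lambda t: 1.0, v_inv=lambda s: s)
```

In the CVaR example the scan stops at the second smallest loss with $T_{m^*} = 0$, so the cutoff equals that loss and $\mathcal{J}$ contains the two largest scenarios. The bisection never evaluates the slope at the right endpoint itself, which avoids a 0/0 expression when $v'(0) = 0$.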

Proposition 3 Given an optimal solution $x^*$ of the master problem (14), the algorithm described in steps
(i)-(iii) yields an optimal value $\eta^{**}$ in (15), (21) and the set $\mathcal{J}$ to be used during steps 2 and 3 of the
scenario decomposition algorithm.

Proof: First, observe that an optimal solution $\eta^{**}$ of (15) and (21) satisfies $\eta^{**} \le X_{(N)}$. Indeed, assume
to the contrary that $\eta^{**} = X_{(N)} + \epsilon$ for some $\epsilon > 0$. The optimal value of (15) and (3) is then equal to
$X_{(N)} + \epsilon$, and can be improved by selecting, e.g., $\epsilon' = \epsilon/2$.

Next, observe that quantities $T_m$ are equal, up to a factor $1 - \alpha$, to the right derivatives of function $\Phi_X(\eta)$
(23) at $\eta = X_{(m)}$, i.e., $T_m = (1 - \alpha)\,\frac{d^+}{d\eta}\Phi_X(\eta)\big|_{\eta = X_{(m)}}$. The value of $T_N = 1 - \alpha$ follows directly
from the fact that $\Phi_X(\eta) = \eta$ for $\eta \ge X_{(N)}$. Then, if strict inequalities in (25) hold, two cases are
possible. Namely, an optimal $\eta^{**}$ is located inside the interval $(X_{(m^*)}, X_{(m^* + 1)})$ if $\frac{d^-}{d\eta}\Phi_X(X_{(m^* + 1)}) > 0$.
Alternatively, $\eta^{**} = X_{(m^* + 1)}$ if $\frac{d^-}{d\eta}\Phi_X(X_{(m^* + 1)}) \le 0$. Thus, we have the second statement of step
(iii).

If $T_{m^*} = 0$ in (25), observe that necessarily $\frac{d^-}{d\eta}\Phi_X(X_{(m^*)}) \le 0$ since the left derivative of $\Phi_X$ at $X_{(m^*)}$
differs from the expression (24) by an extra summand $\pi_{m^*} v'(0)$ in the numerator. If $v'(0) = 0$ then
$\frac{d^-}{d\eta}\Phi_X(X_{(m^*)}) = \frac{d^+}{d\eta}\Phi_X(X_{(m^*)}) = 0$ and $\eta^{**} = X_{(m^*)}$ is a minimum due to Proposition 2. If $v'(0) > 0$
then $\frac{d^-}{d\eta}\Phi_X(X_{(m^*)}) < 0$ and $\eta^{**} = X_{(m^*)}$ is again either a unique minimizer, or represents the left
endpoint of the set of minimizers. This validates the first claim of step (iii).

Once the value of $\eta^{**}$ is obtained during step (iii), the set $\mathcal{J}$ in (17) is constructed as the set of scenario
indices corresponding to $X_{(m^* + 1)}, X_{(m^* + 2)}, \ldots, X_{(N)}$.

Note that it is not necessary to prove that there always exists $m^* \in \{1, \ldots, N - 1\}$ such that $T_{m^*} \le 0$
and $T_{m^* + 1} > 0$. If indeed it were to happen that $T_m > 0$ for all $m = 1, \ldots, N$, this would imply that set
$\mathcal{J}$ must contain all scenarios, i.e., $\mathcal{J} = \mathcal{N}$, making the exact value of $\eta^{**}$ irrelevant in this case, since
the original problem (13) would have to be solved at the next iteration of the scenario decomposition
algorithm. □

Remark 3 We conclude this section by noting that the presented scenario decomposition approach is
applicable, with appropriate modifications, to more general forms of downside risk measures
$\rho(X) = \min_{\eta}\{\eta + \phi((X - \eta)_+)\}$. The focus of our discussion on the case when function $\phi$ has the form of a
certainty equivalent, $\phi(X) = v^{-1}\mathrm{E}v(X_+)$, is dictated mainly by the fact that the resulting constraint
(13d) encompasses a number of interesting and practically relevant special cases, such as second-order
cone, $p$-order cone, and log-exponential constraints.


4  Computational  Experiments:  Portfolio  Optimization  with  HMCR  and 
LogExpCR  Measures 


Portfolio optimization problems are commonly used as an experimental platform in risk management
and stochastic optimization. In this section we illustrate the computational performance of the proposed
scenario decomposition algorithm on a portfolio optimization problem, where the investment risk is
quantified using HMCR or LogExpCR measures.

A standard formulation of portfolio optimization problem entails determining the vector of portfolio
weights $x = (x_1, \ldots, x_n)^\top$ of $n$ assets so as to minimize the risk while maintaining a prescribed level
of expected return. We adopt the traditional definition of portfolio losses $X$ as negative portfolio returns,
$X(x, \omega) = -r(\omega)^\top x$, where $r(\omega) = (r_1(\omega), \ldots, r_n(\omega))^\top$ are random returns of the assets. Then, the
portfolio selection model takes the general form

$$\begin{array}{lll}
\min & \rho(-r(\omega)^\top x) & (26a) \\
\text{s.t.} & \mathbf{1}^\top x = 1, & (26b) \\
 & \mathrm{E}[r(\omega)^\top x] \ge \bar{r}, & (26c) \\
 & x \ge 0, & (26d)
\end{array}$$

where $\mathbf{1} = (1, \ldots, 1)^\top$, equality (26b) represents the budget constraint, (26c) ensures a minimum
expected portfolio return level, $\bar{r}$, and (26d) corresponds to no-short-selling constraints.

The distribution of the random vector $r(\omega)$ of assets’ returns is given by a finite set of $N$ equiprobable
scenarios $r_j = r(\omega_j) = (r_{1j}, \ldots, r_{nj})^\top$,

$$\pi_j = P\{ r = (r_{1j}, \ldots, r_{nj})^\top \} = 1/N, \quad j \in \mathcal{N} = \{1, \ldots, N\}. \tag{27}$$
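For a given weight vector, the scenario losses and the constraints of model (26) under the equiprobable scenario model (27) can be evaluated directly. The sketch below uses hypothetical helper names and made-up return data purely for illustration.

```python
def portfolio_losses(returns, x):
    """Scenario losses X_j = -r_j^T x; `returns` is a list of scenario return vectors."""
    return [-sum(r_i * x_i for r_i, x_i in zip(r_j, x)) for r_j in returns]


def satisfies_26(returns, x, r_bar, tol=1e-9):
    """Check the budget (26b), expected return (26c) and no-short-selling (26d) constraints."""
    n_scen = len(returns)
    expected_return = -sum(portfolio_losses(returns, x)) / n_scen   # equiprobable scenarios
    return (
        abs(sum(x) - 1.0) <= tol
        and expected_return >= r_bar - tol
        and all(x_i >= -tol for x_i in x)
    )


# Hypothetical data: two assets, two equiprobable return scenarios.
returns = [[0.10, 0.00], [-0.10, 0.02]]
x = [0.5, 0.5]
losses = portfolio_losses(returns, x)
```

The resulting list of losses is exactly the scenario input $X(x, \omega_j)$ required by constraints (13e) once the weights $x$ are fixed.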

4.1  Portfolio  Optimization  with  Higher  Moment  Coherent  Risk  Measures 

In the case when risk measure $\rho$ in (26) is selected as a higher moment coherent risk measure,
$\rho(X) = \mathrm{HMCR}_{p,\alpha}(X)$, the portfolio optimization problem (26) can be written in a stochastic programming form
that is consistent with the general formulation (13) as

$$\begin{array}{lll}
\min & \eta + (1 - \alpha)^{-1} w_0 & (28a) \\
\text{s.t.} & w_0 \ge \|(w_1, \ldots, w_N)\|_p, & (28b) \\
 & \pi_j^{-1/p} w_j \ge -r_j^\top x - \eta, \quad j \in \mathcal{N}, & (28c) \\
 & x \in C, \quad w \ge 0, & (28d)
\end{array}$$

where $C$ represents a polyhedral set comprising the expected return, budget, and no-short-selling
constraints on the vector of portfolio weights $x$:

$$C = \Big\{ x \in \mathbb{R}^n : \sum_{j \in \mathcal{N}} \pi_j r_j^\top x \ge \bar{r},\ \mathbf{1}^\top x = 1,\ x \ge 0 \Big\}. \tag{29}$$

Due to the presence of $p$-order cone constraint (28b), formulation (28) constitutes a $p$-order cone
programming problem (pOCP).

Solution methods for problem (28) are dictated by the specific value of the parameter p in (28b). As has
been mentioned, in the case of p = 1 formulation (28) reduces to an LP problem that corresponds to the
choice of CVaR as the risk measure, a case that has received considerable attention in the literature. In
view of this, of particular interest are nonlinear instances of problem (28), which correspond to values of
the parameter $p \in (1, +\infty)$.

Below we consider instances of (28) with p = 2 and p = 3. In the case of p = 2, problem (28)
can be solved using SOCP self-dual interior point methods. In the case of p = 3 and, generally,
$p \in (1,2) \cup (2,\infty)$, the p-cone (28b) is not self-dual, and we employ two techniques for solving (28) and
the corresponding master problem (14): (i) a SOCP-based approach that relies on the fact that for a
rational p, a p-order cone can be equivalently represented via a sequence of second-order cones, and
(ii) an LP-based approach that allows for obtaining exact solutions of pOCP problems via cutting-plane
methods.
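For a fixed portfolio x, the optimal value of (28) is a one-dimensional convex minimization over $\eta$, which makes the role of p easy to explore numerically. The following is a minimal sketch under stated assumptions: the function name `hmcr`, the ternary search, and the bracketing interval are illustrative choices rather than part of the method described here, and the norm is taken as $\|Y\|_p = (\sum_j \pi_j |y_j|^p)^{1/p}$, which is how (28b)-(28c) read once $w_j$ is eliminated.

```python
def hmcr(losses, probs, p, alpha, iters=200):
    """Evaluate min over eta of eta + (1-alpha)^-1 * ||(X - eta)_+||_p
    for a discrete loss X with outcomes `losses` and probabilities `probs`.
    Assumes 0 < alpha < 1 and p >= 1. Returns (value, eta)."""
    c = 1.0 / (1.0 - alpha)

    def objective(eta):
        tail = sum(pi * max(x - eta, 0.0) ** p for x, pi in zip(losses, probs))
        return eta + c * tail ** (1.0 / p)

    # Bracket the minimizer: objective(max loss) = max loss, and for
    # eta <= min loss the objective is at least c*mean - (c-1)*eta.
    mean = sum(pi * x for x, pi in zip(losses, probs))
    hi = max(losses)
    lo = min(min(losses), (c * mean - hi) / (c - 1.0))

    # Ternary search is valid because the objective is convex in eta.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    eta = 0.5 * (lo + hi)
    return objective(eta), eta
```

With p = 1 the function returns the CVaR of the loss distribution (for four equally likely losses 1, 2, 3, 4 and $\alpha$ = 0.5 this is the mean of the two largest losses), and larger p can only increase the value for the same data.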

Detailed discussions of the respective formulations of problem (28) are provided below. Throughout this
section, we use abbreviations in brackets to denote the different formulations of the “complete” versions
of (28) (i.e., with the complete set of scenario constraints (28c)). For each “complete” formulation, we
also consider the corresponding scenario decomposition approach, indicated by the suffix “SD”. Within the
scenario decomposition approach, we present formulations of the master problem (denoted by the subscript
“MP”); the respective subproblems are then constructed accordingly. For example, the SOCP version of
the complete problem (28) with p = 2 is denoted [SOCP], while the same problem solved by scenario
decomposition is referred to as [SOCP-SD], with the master problem being denoted as [SOCP-SD]$_{\mathrm{MP}}$
(see below).


4.1.1  SOCP  Formulation  in  p  =  2  Case. 


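In the case of p = 2, formulation (28) constitutes a standard SOCP problem that can be solved using a
number of available SOCP solvers, such as CPLEX, MOSEK, GUROBI, etc. In order to solve it using
the scenario decomposition algorithm presented in Section 3, the master problem (14) is formulated with
respect to the original problem (28) with p = 2 as follows:

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 \ge \|(w_1, \ldots, w_N)\|_2, \\
& \sum_{j \in S_k} \frac{\pi_j^{1/2}}{\pi^{(k)}}\, w_j \ge -\sum_{j \in S_k} \frac{\pi_j}{\pi^{(k)}}\, r_j^\top x - \eta, \quad k \in \mathcal{K}, \\
& x \in C, \quad w \ge 0.
\end{aligned}
\tag*{[SOCP-SD]$_{\mathrm{MP}}$}
\]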


Note that in the case of the $\mathrm{HMCR}_{2,\alpha}$ measure, the function $v(t) = t^2$ is positive homogeneous of degree
two, which allows for eliminating the scenario probabilities $\pi_j$ from constraint (14d) and representing the
latter in the form of a second-order cone in the full formulation (28) and in the master problem
[SOCP-SD]$_{\mathrm{MP}}$. This affects constraints (14d), which then can be written in the form of the second constraint in
[SOCP-SD]$_{\mathrm{MP}}$. The subproblem (15) is reformulated accordingly.
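The homogeneity argument can be written out in one line. The sketch below assumes that the risk constraint of the general formulation, specialized to $v(t) = t^p$, reads $w_0 \ge (\sum_j \pi_j w_j^p)^{1/p}$; that explicit form is our reading of (13), which is not reproduced in this section, and the rescaled variable $\tilde w_j$ is introduced here only for illustration.

```latex
\[
\Big(\sum_{j \in \mathcal{N}} \pi_j w_j^{p}\Big)^{1/p}
= \Big(\sum_{j \in \mathcal{N}} \big(\pi_j^{1/p} w_j\big)^{p}\Big)^{1/p}
= \big\|(\tilde w_1, \ldots, \tilde w_N)\big\|_p,
\qquad \tilde w_j = \pi_j^{1/p} w_j .
\]
% The scenario constraint w_j >= -r_j^T x - eta then becomes
\[
\pi_j^{-1/p}\, \tilde w_j \ge -r_j^\top x - \eta, \qquad j \in \mathcal{N},
\]
% which matches (28c); for p = 2 the norm constraint is a second-order cone.
```

The probabilities therefore move out of the nonlinear constraint and into the linear scenario constraints, which is what makes the p = 2 case a standard SOCP.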


4.1.2  SOCP  Reformulation  of  p-Order  Cone  Program.


One of the possible approaches for solving the pOCP problem (28) with p = 3 involves reformulating the
p-cone constraint (28b) via a set of quadratic cone constraints. Such an exact reformulation is possible
when the parameter p has a rational value, p = q/s. Then, a (q/s)-order cone constraint in the positive
orthant $\mathbb{R}_+^{N+1}$,

\[
\Big\{ w \ge 0 : w_0 \ge \big(w_1^{q/s} + \cdots + w_N^{q/s}\big)^{s/q} \Big\}, \tag{30}
\]

may equivalently be represented as the following set in $\mathbb{R}_+^{N+1} \times \mathbb{R}_+^{N}$:

\[
\big\{ w, u \ge 0 : w_0 \ge \|u\|_1,\ w_j^{q} \le u_j^{s}\, w_0^{q-s},\ j \in \mathcal{N} \big\}. \tag{31}
\]

Each of the N nonlinear inequalities in (31) can in turn be represented as a sequence of three-dimensional
rotated second-order cones of the form $\xi_0^2 \le \xi_1 \xi_2$, resulting in a SOCP reformulation of the rational-order
cone (30) (Nesterov and Nemirovski, 1994; Alizadeh and Goldfarb, 2003; Krokhmal and Soberanis,
2010). Such a representation, however, is not unique and in general may comprise a varying number of
rotated second-order cones for a given p = q/s. In this case study we use the technique of Morenko
et al. (2013), which allows for representing rational-order p-cones with p = q/s in $\mathbb{R}^{N+1}$ via $N \lceil \log_2 q \rceil$
second-order cones. Namely, in the case of p = 3, when q = 3, s = 1, the 3-order cone (30) can
equivalently be replaced with $\lceil \log_2 3 \rceil N = 2N$ quadratic cones

\[
\big\{ w, u, v \ge 0 : w_0 \ge \|u\|_1,\ w_j^2 \le w_0 v_j,\ v_j^2 \le w_j u_j,\ j \in \mathcal{N} \big\}. \tag{32}
\]
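The equivalence between the 3-order cone and the lifted set (32) can be checked numerically on small vectors. The sketch below is illustrative only: the function names are hypothetical, and the particular choice $u_j = w_j^3 / w_0^2$, $v_j = w_j^2 / w_0$ is one natural lifting (it makes both quadratic inequalities in (32) tight), not necessarily the one produced by a solver.

```python
def lift_cubic_cone(w0, w):
    """Given w0 > 0 and w >= 0, build auxiliary vectors (u, v) for (32).
    With this choice, w_j^2 = w0 * v_j and v_j^2 = w_j * u_j hold with equality."""
    u = [wj ** 3 / w0 ** 2 for wj in w]
    v = [wj ** 2 / w0 for wj in w]
    return u, v


def satisfies_32(w0, w, u, v, tol=1e-9):
    """Check the three groups of constraints in (32) up to a small tolerance."""
    return (
        sum(u) <= w0 + tol
        and all(wj ** 2 <= w0 * vj + tol for wj, vj in zip(w, v))
        and all(vj ** 2 <= wj * uj + tol for wj, uj, vj in zip(w, u, v))
    )


def in_cubic_cone(w0, w, tol=1e-9):
    """Direct check of w0 >= ||w||_3 for nonnegative w."""
    return sum(wj ** 3 for wj in w) ** (1.0 / 3.0) <= w0 + tol
```

For w = (1, 2, 0.5) the 3-norm is about 2.09, so the lifting is feasible for $w_0 = 2.5$ and infeasible for $w_0 = 2$. Chaining the two quadratic inequalities gives $w_j^4 \le w_0^2 v_j^2 \le w_0^2 w_j u_j$, i.e. $w_j^3 \le u_j w_0^2$, which is (31) with q = 3, s = 1.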

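In accordance with the above, a p-order cone inequality in $\mathbb{R}_+^{N+1}$ can be represented by a set of 3D
second-order cone constraints and a linear inequality when p is a positive rational number. Thus, the
SOCP reformulation [SpOCP] of problem (28) takes the following form:

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 \ge \|u\|_1, \\
& w_j^2 \le w_0 v_j, \quad v_j^2 \le w_j u_j, \quad j \in \mathcal{N}, \\
& \pi_j^{-1/p} w_j \ge -r_j^\top x - \eta, \quad j \in \mathcal{N}, \\
& x \in C, \quad w, v, u \ge 0.
\end{aligned}
\tag*{[SpOCP]}
\]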


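The corresponding master problem [SpOCP-SD]$_{\mathrm{MP}}$ in the scenario decomposition-based
method is constructed by replacing constraints of the form (28c) in the last problem as follows:

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 \ge \|u\|_1, \\
& w_j^2 \le w_0 v_j, \quad v_j^2 \le w_j u_j, \quad j \in \mathcal{N}, \\
& \sum_{j \in S_k} \frac{\pi_j^{1-1/p}}{\pi^{(k)}}\, w_j \ge -\sum_{j \in S_k} \frac{\pi_j}{\pi^{(k)}}\, r_j^\top x - \eta, \quad k \in \mathcal{K}, \\
& x \in C, \quad w, v, u \ge 0.
\end{aligned}
\tag*{[SpOCP-SD]$_{\mathrm{MP}}$}
\]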


4.1.3  An  Exact  Solution  Method  for  pOCP  Programs  Based  on  Polyhedral  Approximations. 

Computational methods for solving p-order cone programming problems that are based on polyhedral
approximations (Krokhmal and Soberanis, 2010; Vinel and Krokhmal, 2014b) represent an alternative to
interior-point approaches, and can be beneficial in situations when a pOCP problem needs to be solved
repeatedly, with small variations in problem data or problem structure.

Thus, in addition to the SOCP-based approaches for solving the pOCP problem (28) discussed above, we
also employ an exact polyhedral-based approach with $O(\varepsilon^{-1})$ iteration complexity that was proposed in
Vinel and Krokhmal (2014b). It consists in reformulating the p-order cone $w_0 \ge \|(w_1, \ldots, w_N)\|_p$ via
a set of three-dimensional p-cones

\[
w_0 = w_{2N-1}, \qquad w_{N+j} \ge \|(w_{2j-1}, w_{2j})\|_p, \quad j = 1, \ldots, N-1, \tag{33}
\]

and then iteratively building outer polyhedral approximations of the 3D p-cones until a solution of
desired accuracy $\varepsilon > 0$ is obtained:

\[
\|(w_1, \ldots, w_N)\|_p \le (1 + \varepsilon)\, w_0 .
\]

In the context of the lifted representation (33), the above $\varepsilon$-relaxation of the p-cone inequality translates into
N − 1 corresponding approximation inequalities for 3D p-cones:

\[
\|(w_{2j-1}, w_{2j})\|_p \le (1 + \hat\varepsilon)\, w_{N+j}, \quad j = 1, \ldots, N-1, \tag{34}
\]

where $\hat\varepsilon = (1 + \varepsilon)^{1/\lceil \log_2 N \rceil} - 1$. Then, for a given $\varepsilon > 0$, an $\varepsilon$-approximate solution of the pOCP portfolio
optimization problem (28) is obtained by iteratively solving the linear programming problem

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 = w_{2N-1}, \\
& w_{N+j} \ge \alpha_p(\theta_{kj})\, w_{2j-1} + \beta_p(\theta_{kj})\, w_{2j}, \quad \theta_{kj} \in \Theta_j, \quad j = 1, \ldots, N-1, \\
& \pi_j^{-1/p} w_j \ge -r_j^\top x - \eta, \quad j \in \mathcal{N}, \\
& x \in C, \quad w \ge 0,
\end{aligned}
\tag*{[LpOCP]}
\]

where coefficients $\alpha_p$ and $\beta_p$ are defined as

\[
\alpha_p(\theta) = \frac{\cos^{p-1}\theta}{(\cos^{p}\theta + \sin^{p}\theta)^{1 - 1/p}}, \qquad
\beta_p(\theta) = \frac{\sin^{p-1}\theta}{(\cos^{p}\theta + \sin^{p}\theta)^{1 - 1/p}} .
\]


If, for a given solution $w^* = (w_0^*, \ldots, w_{2N-1}^*)$ of [LpOCP], the approximation condition (34) is not
satisfied for some $j = 1, \ldots, N-1$,

\[
\|(w_{2j-1}^*, w_{2j}^*)\|_p > (1 + \hat\varepsilon)\, w_{N+j}^*, \tag{35}
\]

then a cut of the form

\[
w_{N+j} \ge \alpha_p(\theta^*)\, w_{2j-1} + \beta_p(\theta^*)\, w_{2j}, \qquad \theta^* = \arctan \frac{w_{2j}^*}{w_{2j-1}^*}, \tag{36}
\]

is added to [LpOCP]. The process is initialized with $\Theta_j = \{\theta_1\}$, $\theta_1 = \pi/4$, $j = 1, \ldots, N-1$, and
continues until no violations of the form (35) are found. In Vinel and Krokhmal (2014b) it was shown that
this cutting-plane procedure generates an $\varepsilon$-approximate solution to pOCP problem (28) within $O(\varepsilon^{-1})$
iterations.
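The two ingredients of this scheme, the lifted "tower" of 3D cones in (33) and the separation step (35)-(36), can be sketched for fixed numeric vectors as follows. This is a minimal illustration, not the implementation used in the experiments: all function names are hypothetical, no LP is solved, and `atan2` is used in place of arctan of the ratio so that a zero first coordinate is handled.

```python
import math


def p_norm(values, p):
    """Plain p-norm of a sequence of numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def lift_tower(w, p):
    """Extend (w_1..w_N) to (w_1..w_{2N-1}) with
    w_{N+j} = ||(w_{2j-1}, w_{2j})||_p, j = 1..N-1, i.e. (33) held with equality."""
    ext = list(w)                      # ext[i - 1] stores w_i
    for j in range(1, len(w)):         # j = 1, ..., N - 1
        ext.append(p_norm((ext[2 * j - 2], ext[2 * j - 1]), p))
    return ext


def cut_coefficients(p, theta):
    """Coefficients alpha_p(theta), beta_p(theta) of a supporting plane of the 3D p-cone."""
    c, s = math.cos(theta), math.sin(theta)
    denom = (c ** p + s ** p) ** (1.0 - 1.0 / p)
    return c ** (p - 1) / denom, s ** (p - 1) / denom


def separate(a, b, top, p, eps_hat):
    """Return the cut (alpha, beta) of form (36) if (a, b, top) violates (34), else None."""
    if p_norm((a, b), p) <= (1.0 + eps_hat) * top:
        return None
    return cut_coefficients(p, math.atan2(b, a))
```

The last entry of `lift_tower(w, p)` reproduces the p-norm of the whole vector, which is why the N − 1 small cones can replace the single large one. A cut generated at $\theta^*$ is tight at the point that produced it, so after adding it the same point no longer violates (34).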

The described cutting plane scheme can be employed to solve the master problem corresponding to
the pOCP problem (28). Namely, the cutting-plane formulation of this master problem is obtained by
replacing the p-cone constraint (28b) with cutting planes similarly to [LpOCP], and the set of N scenario
constraints (28c) with the aggregated constraints (compare to [SpOCP-SD]$_{\mathrm{MP}}$):

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 = w_{2N-1}, \\
& w_{N+j} \ge \alpha_p(\theta_{kj})\, w_{2j-1} + \beta_p(\theta_{kj})\, w_{2j}, \quad \theta_{kj} \in \Theta_j, \quad j = 1, \ldots, N-1, \\
& \sum_{j \in S_k} \frac{\pi_j^{1-1/p}}{\pi^{(k)}}\, w_j \ge -\sum_{j \in S_k} \frac{\pi_j}{\pi^{(k)}}\, r_j^\top x - \eta, \quad k \in \mathcal{K}, \\
& x \in C, \quad w \ge 0.
\end{aligned}
\tag*{[LpOCP-SD]$_{\mathrm{MP}}$}
\]

4.2  Portfolio  Optimization  with  Log  Exponential  Convex  Risk  Measures 

In order to demonstrate the applicability of the proposed method when solving problems with measures
of risk other than the HMCR class, we examine an analogous experimental framework for instances
when $\rho(X) = \mathrm{LogExpCR}_{e,\alpha}(X)$. The portfolio optimization problem (26) may then be written as

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 \ge \ln \sum_{j \in \mathcal{N}} \pi_j e^{w_j}, \\
& w_j \ge -r_j^\top x - \eta, \quad j \in \mathcal{N}, \\
& x \in C, \quad w \ge 0.
\end{aligned}
\tag*{[LogExpCP]}
\]

Note that in contrast to the pOCP and SOCP problems discussed in the preceding subsections, the above
formulation is not a conic program. Since it involves a convex log-exponential constraint, we call this
problem a log-exponential convex programming problem (LogExpCP); it can be solved with interior
point methods.
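When checking a candidate solution of [LogExpCP] outside the solver, the right-hand side of the log-exponential constraint is best evaluated with the usual max-shift, since the terms $e^{w_j}$ overflow quickly. The helper below is a small sketch with a hypothetical name; it is not taken from the implementation described in this report.

```python
import math


def log_exp_rhs(w, probs):
    """Evaluate ln(sum_j pi_j * exp(w_j)) stably by factoring out exp(max(w))."""
    m = max(w)
    return m + math.log(sum(pi * math.exp(wj - m) for wj, pi in zip(w, probs)))
```

By Jensen's inequality the returned value is never smaller than the probability-weighted mean of the $w_j$, which gives a quick sanity check on the constraint $w_0 \ge \ln \sum_j \pi_j e^{w_j}$.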

The corresponding master problem for the scenario decomposition algorithm is obtained from
[LogExpCP] by aggregating the scenario constraints in accordance with (14):

\[
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} w_0 \\
\text{s.t.}\ & w_0 \ge \ln \sum_{j \in \mathcal{N}} \pi_j e^{w_j}, \\
& \sum_{j \in S_k} w_j \ge -\sum_{j \in S_k} r_j^\top x - |S_k|\, \eta, \quad k \in \mathcal{K}, \\
& x \in C, \quad w \ge 0.
\end{aligned}
\tag*{[LogExpCP-SD]$_{\mathrm{MP}}$}
\]

In the next section we examine the computational performance of each implementation class of
problem (28).


4.3  Computational  Results 

The portfolio optimization problems described in Sections 4.1 and 4.2 were implemented in C++ using
callable libraries of three solvers, CPLEX 12.5, GUROBI 5.02, and MOSEK 6. Computations ran on
a six-core 2.30GHz PC with 128GB RAM in a 64-bit Windows environment. In the context of
benchmarking, each adopted formulation was tested against its scenario decomposition-based implementation.
Moreover, it was of particular interest to examine the performance of the scenario decomposition
algorithm using various risk measure configurations. Thus, the following problem settings were solved:
problems [SOCP]-[SOCP-SD] with the risk measure as defined by (5) for p = 2; problems
[SpOCP]-[SpOCP-SD] and [LpOCP]-[LpOCP-SD] with measure (5) for p = 3; and problems
[LogExpCP]-[LogExpCP-SD] with risk measure (6). The value of the parameter $\alpha$ in the employed risk
measures was fixed at $\alpha = 0.9$ throughout.

The scenario data in our numerical experiments were generated as follows. First, a set of n stocks (n = 50,
100, 200) was selected at random from the S&P500 index. Then, a covariance matrix of daily returns as
well as the expected returns were estimated for the specific set of n stocks using historical prices from
January 1, 2006 to January 1, 2012. Finally, the desired number N of scenarios, ranging from 1,000 to
100,000, was generated as N independent and identically distributed samples from a multivariate
normal distribution with the obtained mean and covariance matrix.
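The last step of this procedure, drawing i.i.d. scenarios from a multivariate normal distribution with a given mean and covariance, can be sketched with the standard library alone. This is an illustrative sketch, not the code used in the experiments (which were implemented in C++): the function names are hypothetical, and the covariance matrix is assumed to be symmetric positive definite so that a Cholesky factor exists.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_scenarios(mean, cov, num_scenarios, seed=0):
    """Draw num_scenarios i.i.d. return vectors r = mean + L z, with z standard normal."""
    rng = random.Random(seed)
    low = cholesky(cov)
    n = len(mean)
    scenarios = []
    for _ in range(num_scenarios):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scenarios.append(
            [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        )
    return scenarios
```

Each returned row plays the role of one scenario vector $r_j$ in (28c); with equally likely scenarios, $\pi_j = 1/N$.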

On account of finite-precision arithmetic errors associated with the numerical solvers, we introduced a
tolerance level $\varepsilon > 0$ to specify the permissible gap in the stopping criterion (16):

\[
\eta^{**} + (1-\alpha)^{-1} w_0^{**} \le h(x^*) + \varepsilon. \tag{37}
\]

Specifically, the value $\varepsilon = 10^{-5}$ was chosen to match the reduced cost of the simplex method in
CPLEX and GUROBI. In a similar manner, we adjust (24) around $m^*$ for precision errors as

\[
T_{m^*+1}(p) - \varepsilon < 0 \quad \text{and} \quad T_{m^*}(p) + \varepsilon > 0 .
\]

Empirical observations suggest that the accumulation of numerical errors is exacerbated by the use of
fractional values of the scenario asset returns $r_{ij}$. To alleviate the numerical accuracy issues, the data in
the respective problem instances of the scenario decomposition algorithm were appropriately scaled.

The results of our numerical experiments are summarized in Tables 1-5. Unless stated otherwise, the
reported running time values are averaged over 20 instances. Table 1 presents the computational times
observed when solving the full formulation, [SOCP], of problem (28) with the HMCR measure and p = 2,
and when solving the same problem using the scenario decomposition algorithm, [SOCP-SD], with the three
solvers, CPLEX, GUROBI, and MOSEK. Observe that the scenario decomposition method performs
better for all instances and solvers, with the exception of the largest three scenario instances when using
GUROBI with n = 50 assets. However, this trend is tempered as the number of assets increases.

Table 2 reports the running times observed when solving the second-order cone reformulation of
the pOCP version of problem (28) with p = 3, in the full formulation ([SpOCP]) and via the scenario
decomposition algorithm ([SpOCP-SD]). The obtained results indicate that, although the scenario
decomposition algorithm is slower on smaller problem instances, it outperforms direct solution methods as
the numbers of scenarios N and assets n in the problem increase. Due to observed numerical instabilities,
the CPLEX solver was not considered for this particular experiment.

Next, the same problem is solved using the polyhedral approximation cutting-plane method
described in Section 4.1.3. Table 3 shows the running times achieved by all three solvers for problems
[LpOCP] and [LpOCP-SD] with p = 3. In this case, the scenario decomposition method resulted in
order-of-magnitude improvements, which can be attributed to the “warm-start” capabilities of CPLEX
and GUROBI’s simplex solvers. Consistent with these conclusions is also the fact that the simplex-based
solvers of CPLEX and GUROBI yield improved solution times on the full problem formulation
compared to the SOCP-based reformulation [SpOCP], where barrier solvers were invoked. The
discrepancy between [LpOCP] and [LpOCP-SD] solution times is especially prominent for MOSEK, but in this
case it appears that MOSEK’s interior-point LP solver was much less effective at solving the [LpOCP]
formulation using the cutting plane method.

                     CPLEX                 GUROBI                  MOSEK
  n        N    [SOCP]  [SOCP-SD]     [SOCP]  [SOCP-SD]     [SOCP]  [SOCP-SD]
 50     1000      1.00       0.46       0.62       0.45       0.26       0.15
        2500      3.03       0.51       1.88       1.07       0.60       0.36
        5000      6.58       0.55       3.81       2.78       1.24       0.72
       10000     13.72       1.35       9.56       7.89       2.56       1.61
       25000     31.03       3.53      32.40      34.04       7.33       5.18
       50000     60.62       9.05     101.09     117.24      17.64      12.43
      100000    137.14      25.25     327.95     449.78      36.78      33.02
100     1000      2.46       0.86       1.73       0.42       0.61       0.18
        2500      6.14       0.99       4.87       1.17       1.50       0.47
        5000     13.69       1.10      11.13       3.55       3.25       1.15
       10000     27.06       2.21      21.94       9.63       6.69       3.03
       25000     72.95       8.85      71.34      37.48      20.41       6.88
       50000    157.25      20.88     185.56     129.37      44.01      16.61
      100000    319.90      58.29     464.12     467.35      79.75      41.58
200     1000      6.87       2.19       5.60       0.58       6.68       0.29
        2500     17.48       2.10      15.36       1.37       4.49       0.73
        5000     34.93       2.98      33.96       4.15       9.36       1.92
       10000     76.13       5.03      63.67      16.50      19.54       5.51
       25000    206.29      24.16     196.45      54.00      53.89      29.15
       50000    447.85      55.93     438.40     152.76     112.47      28.85
      100000    950.17     112.60     998.86     539.46     234.68      61.98

Table 1: Average computation times (in seconds) obtained by solving problems [SOCP] and [SOCP-SD] for p = 2
using CPLEX, GUROBI and MOSEK. All running times are averaged over 20 instances.

Finally, Table 4 displays the running times for the discussed implementation of problems [LogExpCP]
and [LogExpCP-SD]. Of the three solvers considered in this case study, only MOSEK was capable of
handling problems with constraints that involve sums of univariate exponential functions. Again, the
scenario decomposition-based solution method appears to be preferable to solving the full
formulation. Note, however, that computational times were not averaged over 20 instances in this case
due to numerical difficulties associated with the solver for many instances of [LogExpCP].

It is also of interest to comment on the number of scenarios that had to be generated during the scenario
decomposition procedure in order to yield an optimal solution. Table 5 lists the corresponding average
number of scenarios partitioned for each problem type over all instances. Although these numbers may
differ slightly among the three solvers, we only present results for MOSEK, as it was the only solver used
to solve all the problems in Sections 4.1 and 4.2. Observe that far fewer scenarios are required relative
to the total set size N. In fact, the number of scenarios that were generated during the algorithm in
order to achieve optimality was between 0.7% and 11% of the total scenario set size.

                      GUROBI                    MOSEK
  n        N    [SpOCP]  [SpOCP-SD]     [SpOCP]  [SpOCP-SD]
 50     1000       2.58        2.73        0.18        0.63
        2500      10.63        6.61        0.49        0.96
        5000      32.01       19.27        1.06        1.70
       10000      87.27       41.34        2.31        3.49
       25000     198.56       92.39        7.14        6.70
       50000     455.63      540.09       16.36       13.70
      100000    1217.96     2080.34       35.33       30.29
100     1000       7.16        3.14        0.30        0.75
        2500      29.47        8.44        0.85        1.37
        5000      90.25       19.74        1.88        2.32
       10000     277.72       44.31        4.52        3.91
       25000     642.63       92.11       12.66        8.66
       50000    1365.37     1716.37       28.64       15.10
      100000         —           —        65.48       28.29
200     1000      17.86        3.87        0.69        1.01
        2500      78.28        8.65        1.90        1.56
        5000     276.89       22.40        4.41        2.47
       10000     799.65       49.02        9.88        4.84
       25000    2118.11      107.14       29.99        9.60
       50000         —           —        64.52       17.41
      100000         —           —       139.87       34.99

Table 2: Average computation times (in seconds) obtained by solving problems [SpOCP] and [SpOCP-SD] for
p = 3 using GUROBI and MOSEK. All running times are averaged over 20 instances and symbol “—” indicates
that the time limit of 3600 seconds was exceeded.


5  Conclusions 

In this work, we propose an efficient algorithm for solving large-scale convex stochastic programming
problems that involve a class of risk functionals in the form of infimal convolutions of certainty
equivalents. We exploit the property induced by such risk functionals that a significant portion of scenarios
is not required to obtain an optimal solution. The developed scenario decomposition technique is
contingent on the identification and separation of “non-redundant” scenarios by solving a series of smaller
relaxation problems. It is shown that the number of iterations of the algorithm is bounded by the number
of scenarios in the problem. Numerical experiments with portfolio optimization problems based on
simulated return data following the covariance structure of randomly chosen S&P500 stocks demonstrate
that significant reductions in solution times may be achieved by employing the proposed algorithm. In
particular, performance improvements were observed for the large-scale instances when using HMCR
measures with p = 2, 3 and LogExpCR measures.

                      CPLEX                   GUROBI                    MOSEK
  n        N    [LpOCP]  [LpOCP-SD]     [LpOCP]  [LpOCP-SD]     [LpOCP]  [LpOCP-SD]
 50     1000       0.27        0.12        0.22        0.59        0.82        0.46
        2500       1.65        0.24        0.74        0.83        4.26        0.66
        5000       6.81        0.46        2.31        1.54       15.08        1.46
       10000      19.20        1.42        7.73        3.86       60.66        3.75
       25000      31.93        3.93       56.52       13.74      381.67       11.34
       50000     179.49       16.07      117.72       36.51     1412.81       25.47
      100000     903.36       62.79      474.68      112.72         —         54.45
100     1000       0.37        0.13        0.23        0.61        2.94        0.65
        2500       2.22        0.28        0.86        0.98        7.11        1.06
        5000       8.58        0.79        2.82        1.76       32.20        1.95
       10000      28.71        2.18        9.28        4.13      122.75        4.99
       25000      45.37        4.99       35.11       13.13     1138.99       15.34
       50000     200.12       18.80      122.21       39.78     2753.54       34.17
      100000    3336.26       82.79     1316.29      138.74         —         80.15
200     1000       0.61        0.20        0.33        0.89       15.68        1.06
        2500       3.13        0.44        1.30        1.17       20.64        1.37
        5000      13.25        1.01        3.72        2.11       70.49        2.97
       10000      47.97        3.31       13.20        4.72      322.36        8.12
       25000     195.28        6.98       94.45       14.77     2418.52       26.91
       50000     936.60       27.20      665.61       45.43         —         53.62
      100000         —       114.08     3301.44      160.92         —        123.89

Table 3: Average computation times (in seconds) obtained by solving problems [LpOCP] and [LpOCP-SD] for
p = 3 using CPLEX, GUROBI and MOSEK. All running times are averaged over 20 instances and symbol “—”
indicates that the time limit of 3600 seconds was exceeded.

                          MOSEK
  n        N    [LogExpCP]  [LogExpCP-SD]  Instances Solved
  …        …          0.61           0.27                12
  …     2500          0.97             …                  …

[Table 4: only the header and the entries shown above survive in the source.]

                              MOSEK
  n        N    [SOCP-SD]  [SpOCP-SD]  [LpOCP-SD]  [LogExpCP-SD]
 50     1000         80.3        24.8        21.3           61.8
           …           …           …           …              …
       50000       3802.9       457.8       729.7         2418.6
      100000       7323.3       831.3      1395.8          923.4
200     1000        108.2        39.5        36.4          100.7
        2500        201.7        72.7        73.0          154.5
        5000        395.6       116.3       119.6          198.1
       10000        744.0       184.9       171.2          304.6
       25000       1805.5       308.3       347.0          464.2
       50000       3607.8       512.2       697.6          788.1
      100000       7198.9       865.0      1384.3         1153.5

[Rows marked “…” are incomplete in the source. Their surviving entries, in source order, are: 47.8, 47.0, 77.8;
(N = 5000) 80.3, 79.0, 104.6; 1834.9, 232.0, 318.3; 675.0, 841.7; (N = 100000) 1447.5; 87.2, 32.0, 27.0, 81.4;
191.2, 73.6, 74.1, 107.8; 367.6, 107.4; 192.2; 148.9, 156.9, 229.7; 278.1, 348.6, 1869.1.]

Table 5: Average number of partitioned scenarios from solving the scenario decomposition-based problems listed
in Sections 4.1 and 4.2.
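The share of scenarios that the decomposition procedure actually needs can be recomputed directly from Table 5. The snippet below is a small arithmetic check that uses only the n = 200 rows of Table 5 as transcribed from the source; the variable and function names are illustrative.

```python
# Table 5, n = 200: N -> ([SOCP-SD], [SpOCP-SD], [LpOCP-SD], [LogExpCP-SD])
TABLE5_N200 = {
    1000: (108.2, 39.5, 36.4, 100.7),
    2500: (201.7, 72.7, 73.0, 154.5),
    5000: (395.6, 116.3, 119.6, 198.1),
    10000: (744.0, 184.9, 171.2, 304.6),
    25000: (1805.5, 308.3, 347.0, 464.2),
    50000: (3607.8, 512.2, 697.6, 788.1),
    100000: (7198.9, 865.0, 1384.3, 1153.5),
}


def scenario_percentages(rows):
    """Average number of partitioned scenarios as a percentage of the scenario count N."""
    return {n_scen: tuple(100.0 * v / n_scen for v in vals) for n_scen, vals in rows.items()}


percentages = scenario_percentages(TABLE5_N200)
smallest = min(min(vals) for vals in percentages.values())
largest = max(max(vals) for vals in percentages.values())
```

For these rows the percentages range from roughly 0.9% ([SpOCP-SD], N = 100000) to roughly 10.8% ([SOCP-SD], N = 1000), which falls inside the 0.7% to 11% range quoted in the discussion of Table 5.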
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Abstract  A  new  two-stage  stochastic  partial  differential 
equation (PDE)-constrained optimization methodology is
developed for the active vibration control of structures in the
presence of uncertainties in mechanical loads. The methodology
relies on the two-stage stochastic optimization formulation
with an embedded first-order black-box PDE-constrained
optimization procedure. The PDE-constrained
optimization procedure utilizes a first-order active-set algorithm
with a conjugate gradient method. The objective function
is determined through solution of the governing PDEs
and its gradient is computed using automatic differentiation
with hyper-dual numbers. The developed optimization
methodology is applied to the problem of post-impact vibration
control (via applied electromagnetic field) of an electrically
conductive carbon fiber reinforced composite plate
subjected to an uncertain, or stochastic, impact load. The
corresponding governing PDEs consist of a nonlinear coupled
system of equations of motion and Maxwell’s equations.
The conducted computational study shows that the
obtained two-stage optimization solution allows for a significant
suppression of vibrations caused by the randomized
impact load in all impact load scenarios. Also, the effectiveness
of the developed methodology is illustrated in the case
of a deterministic impact load, where the two-stage strategy
enables one to practically eliminate post-impact vibrations.

Keywords  PDE-constrained optimization · two-stage
stochastic optimization · electro-magneto-mechanical
coupling · composite materials

1  Introduction 

In electrically conductive solids, mechanical and electromagnetic
fields interact through the Lorentz ponderomotive
force that is exerted by the electromagnetic field. Analysis
of this field interaction requires simultaneous solution
of Maxwell’s equations for the electromagnetic field (Maugin,
1988) and equations of motion of continuous media that involve
the Lorentz force as a body force, whereby the system
of governing equations becomes coupled and nonlinear. This
field coupling leads to many interesting effects observed in
the mechanical behavior of electrically conductive solids
subjected to electromagnetic load, including changes in the
stress state (Moon, 1984; Zhupanska and Sierakowski, 2007,
2011; Higuchi et al, 2007), vibration behavior (Barakati
and Zhupanska, 2012a; Rudnicki, 2002), and unusual stability
behavior (Hasanyan and Piliposyan, 2001; Hasanyan
et al, 2006; Eringen, 1989). Electro-magneto-mechanical
coupling can potentially lead to the development of structures
amenable to active control by the electromagnetic field.
Interactions between mechanical, electromagnetic, and thermal
fields provide a basis for multifunctional materials
and structures.

Composite materials are often considered to be materials
of choice for multifunctional applications (Gibson, 2010)
due to their multiphase nature and inherent tailorability.
As a result, recent years have witnessed a growing interest
in electro-magneto-mechanical interactions in composites.
Most of the studies have focused on the mechanics,
while less attention has been paid to the optimization of
multifunctional composites and structures. The present work
contributes to the latter subject.

The present work is closely related to the recent studies
on the electro-magneto-mechanical coupling in electrically
conductive anisotropic composites (Zhupanska and
Sierakowski, 2007, 2011; Barakati and Zhupanska, 2012a,b,
2013, 2014), where the effects of the steady, slowly varying,
and pulsed electromagnetic fields on the mechanical response
of single-layer and laminated anisotropic composite
plates were examined. The interacting effects of the applied
electric current, external magnetic field, and mechanical
load were studied. It has been shown that the characteristics
of the electromagnetic field (waveform, duration of application,
intensity) can significantly reduce the stressed and
deformed states of the electrically conductive plate and decrease
the amplitude of vibrations. In particular, to achieve
the maximum reduction in the plate deflection and stress,
the application of the mechanical load must be coordinated
with application of the electric current and its waveform.
Moreover, an increase in the magnetic induction tends to reduce
the amplitude of vibrations of the plate with a trend
towards a more rapid decay at the stronger magnetic fields.
An increase in the electric current density tends to decrease
the amplitude of the plate vibrations. Furthermore, the effect
of the electric current density becomes more pronounced as
the magnetic field intensity increases. It has been concluded
that concurrent application of a pulsed electromagnetic load
could effectively mitigate the effects of the impact load and
post-impact vibrations.


1.1  Active  vibration  control  of  a  composite  plate  via  an 
electromagnetic  field:  A  conceptual  application 

The results of the previously discussed studies provided motivation
for the present work on a stochastic partial differential
equation (PDE)-constrained optimization approach
to active control of the mechanical response of electrically
conductive composites using an electromagnetic
field. As a specific application, we consider the problem
of vibration control, via application of an electromagnetic
field, in an electrically conductive carbon fiber reinforced
polymer (CFRP) composite plate subjected to a mechanical
impact load with uncertain parameters (magnitude, duration,
etc). We hypothesize that electromagnetically activated
CFRP structural elements could provide additional protection
against certain types of foreign object impacts, assuming
that an appropriate sensor technology can be employed
for applying an electromagnetic field to the composite structure
at the moment of impact so as to increase the impact
resistance and dampen post-impact vibrations.

The practical viability of this hypothetical scenario depends
on a number of factors, among which are the availability
of (i) composite materials with necessary mechanical
and electromagnetic properties, (ii) adequate sensors to trigger
application of an electromagnetic field, and (iii) ability
to adjust and control characteristics of the applied electromagnetic
field (i.e., waveform, duration of application, intensity)
depending on the target composite material characteristics
and applied impact load. Physics-based models
of electro-magneto-mechanical coupling in electrically conductive
composites can provide theoretical underpinnings
for the development of the electromagnetically activated
impact resistant structural elements, while PDE-constrained
stochastic optimization can provide a path to the active control
of these structural elements in the presence of uncertainties.

In Section 2 we outline the physical model of the field
coupling phenomenon that is exploited in this work. Since
the general model is prohibitively complex, a high-fidelity
approximation of the governing equations in the case of thin
composite plates is discussed. In Section 2.2 we introduce
the actual boundary-value problem corresponding to the impact
of a thin CFRP composite plate in a deterministic setting,
i.e. when the impact load is known with certainty. This
problem forms the basis for the stochastic PDE-constrained
optimization problem that is introduced in Section 3. Numerical
solution and optimization procedures are discussed
in Section 4, and in Section 5 we present the results of
computational studies.

2  Mechanics  of  electro-magneto-mechanical 
interactions  in  electrically  conductive  anisotropic 
composite  plates 

In this section we first outline the governing equations for anisotropic
electrically conductive solids subjected to mechanical and
electromagnetic loads. Then, we discuss a 2D plate approximation, as
well as the resulting 2D nonlinear hyperbolic-parabolic system of PDEs
that constitutes the mathematical framework for solving problems of the
dynamic mechanical response of anisotropic electrically conductive
plates subjected to mechanical and electromagnetic loads. See Zhupanska
and Sierakowski (2007); Barakati and Zhupanska (2012a) for details.

2.1  Governing  equations 

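The behavior and interaction of the mechanical and electromagnetic
fields in electrically conductive solids can be determined by
simultaneously solving the equations of motion, which include the
Lorentz ponderomotive force, and Maxwell's equations:

$$\nabla \cdot \mathbf{T} + \rho\,(\mathbf{F} + \mathbf{F}^{L}) = \rho\,\frac{\partial^{2}\mathbf{u}}{\partial t^{2}}, \tag{1}$$

$$\operatorname{div}\mathbf{D} = \rho_{e}, \quad \operatorname{div}\mathbf{B} = 0, \quad \operatorname{rot}\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \quad \operatorname{rot}\mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t}. \tag{2}$$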

Here $\mathbf{T}$ is the stress tensor, $\mathbf{u}$ is the displacement
vector, $\rho$ is density, $\mathbf{F}$ is the body force per unit mass,
$\mathbf{F}^{L}$ is the Lorentz force per unit mass, and $\nabla$ is the
gradient operator. In Maxwell's equations (2), $\mathbf{D}$ represents
the electric displacement vector, $\mathbf{B}$ is the magnetic
induction, $\mathbf{E}$ is the electric field, $\mathbf{H}$ is the
magnetic field, $\mathbf{j}$ is the current density vector, $\rho_{e}$
is the electric charge density (which vanishes in electric conductors),
and $t$ is time.

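Interaction between mechanical and electromagnetic fields in the
electrically conductive materials is due to the Lorentz force,
$\mathbf{F}^{L}$, that enters equations of motion (1) as a body force.
It has been shown in Zhupanska and Sierakowski (2007) that the Lorentz
force in the electrically conductive anisotropic solids takes the form:

$$\rho\,\mathbf{F}^{L} = \rho_{e}\left(\mathbf{E} + \frac{\partial \mathbf{u}}{\partial t} \times \mathbf{B}\right) + \left(\boldsymbol{\sigma}\left(\mathbf{E} + \frac{\partial \mathbf{u}}{\partial t} \times \mathbf{B}\right)\right) \times \mathbf{B} + \left(\left((\boldsymbol{\varepsilon} - \varepsilon_{0}\mathbf{1})\,\mathbf{E}\right) \times \mathbf{B}\right)_{\alpha} \nabla\left(\frac{\partial \mathbf{u}}{\partial t}\right)_{\alpha} + \mathbf{J}^{*} \times \mathbf{B}, \tag{3}$$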

where $\boldsymbol{\sigma}$ is the electrical conductivity tensor,
$\boldsymbol{\varepsilon}$ is the electrical permittivity tensor,
$\varepsilon_{0}$ is the electrical permittivity in the vacuum, $\nabla$
is the gradient operator, and Einstein's summation convention is adopted
with respect to index $\alpha$. Therefore, the system of equations
(1)-(2) is the system of nonlinear hyperbolic PDEs that represent the
governing equations of electro-magneto-mechanical coupling in
electrically conductive solids. The nonlinearity is due to the presence
of the Lorentz force, which contains nonlinear terms with respect to the
components of the mechanical and electromagnetic fields. In the most
general dynamic case, solving the system of governing equations (1)-(2)
is intractable even for solids of the simplest 3D geometries. In many
situations, however, solution of equations (1)-(2) can be facilitated
through appropriate physics-based hypotheses or simplifications that
exploit particular features of the problem's geometry, etc., and thereby
reduce the mathematical complexity of the model while preserving its
physical fidelity.

With respect to the present work, a 2D approximation for thin
electrically conductive plates subjected to mechanical and
electromagnetic loads is used. This approximation was developed in
Zhupanska and Sierakowski (2007) and utilizes the Kirchhoff hypothesis
of non-deformable normals together with electromagnetic hypotheses.


Next, we briefly outline the procedure to derive the 2D approximation of
the governing equations. More details can be found in Zhupanska and
Sierakowski (2007); Barakati and Zhupanska (2012a). As for the
mechanical part of the governing equations (1), the linear plate theory
formulation based on the so-called Kirchhoff hypothesis of
non-deformable normals is used. Equations of motion with respect to
stress and moment resultants are obtained by integration of (1) across
the thickness of the plate. In contrast to the problems with purely
mechanical load, application of the Kirchhoff hypothesis and integration
of the 3D equations of motion through the thickness of the plate does
not produce 2D equations of motion. This is due to the presence of the
terms with the Lorentz force components, which remain three-dimensional.
Therefore, to obtain a 2D approximation to the equations of motion, one
needs to derive a 2D approximation for the electromagnetic field and the
Lorentz force for the case of thin plates. This is achieved by
introducing additional hypotheses regarding the behavior of the
electromagnetic field components, which imply that the tangential
components of the electric field vector and the normal component of the
magnetic field vector do not change across the thickness of the plate
and the variation of the tangential components of the magnetic field
across the thickness of the plate is linear. A 2D approximation of
Maxwell's equations (2) is obtained by representing functions
$\mathbf{H}$, $\mathbf{E}$, and $\mathbf{J}$ via series expansions with
respect to the coordinate $z$, integrating Maxwell's 3D equations across
the thickness of the plate and invoking a quasistatic approximation for
Maxwell's equations. The 2D expression for the Lorentz force (3) is
obtained using the Kirchhoff hypothesis for the plate displacements and
the set of the discussed electromagnetic hypotheses. The 2D equations of
motion are then obtained by integrating the terms with the Lorentz force
across the thickness of the plate in the equations of motion with
respect to the stress and moment resultants.

Finally, the 2D equations of motion and 2D Maxwell's equations
constitute the system of governing equations for an electrically
conductive plate subjected to mechanical and electromagnetic loads and
correspond to the linear plate theory. This system of equations is a
nonlinear mixed system of parabolic and hyperbolic PDEs.


2.2  Impact  problem:  A  deterministic  formulation 

In this section we present the boundary-value problem for a thin
anisotropic composite plate subject to a deterministic mechanical impact
load and electromagnetic field within the mathematical framework
presented in the previous subsection. Such a deterministic formulation
was considered in Barakati and Zhupanska (2012a) and forms the basis for
the stochastic model with uncertain impact loads and the corresponding
stochastic optimization formulations that will be introduced in
Section 3.

Consider a thin unidirectional fiber-reinforced ($x$-direction is the
fiber direction) electrically conductive composite plate of width $a$
and thickness $h$ subjected to the transverse impact load:

$$\mathbf{p}(x,y,t) = [0, 0, p_z(x,y,t)], \tag{4}$$

time-dependent electric current of density:

$$\mathbf{J}^{*}(t) = [J_x^{*}(t), 0, 0], \tag{5}$$

and immersed in the constant magnetic field with the induction:

$$\mathbf{B}^{*} = [0, B_y^{*}, 0]. \tag{6}$$

It is assumed that the intensity of the current is such that the
associated thermal effects are negligible.

The plate is transversely isotropic, where $y$-$z$ is the plane of
isotropy and the $x$-$y$ plane coincides with the middle plane of the
plate. The plate is assumed to be long in the fiber direction, which is
also the direction of the applied current ($x$-direction), simply
supported along the long sides, arbitrarily supported along the short
sides (see Figure 1), and initially at rest.


Fig.  1  Composite  plate  subjected  to  impact  and  electromagnetic  loads 


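The corresponding mechanical and electromagnetic boundary conditions
are:

$$\tau_{zz}\big|_{z=h/2} = -p_z(y,t), \tag{7a}$$

$$u_y\big|_{y=\pm a/2} = u_z\big|_{y=\pm a/2} = M_{yy}\big|_{y=\pm a/2} = 0, \tag{7b}$$

$$\left(E_x - \frac{\partial w}{\partial t}B_y^{*} + \frac{\partial v}{\partial t}B_z\right)\bigg|_{y=-a/2} = 0, \qquad E_x\big|_{y=a/2} = 0. \tag{7c}$$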

The applied transverse impact load (4) causes vibrations in the plate,
which can potentially be mitigated by application of the external
electromagnetic field consisting of the electric current of density (5)
and magnetic induction (6). We are interested in the optimal
characteristics of the electromagnetic field to maximally reduce
mechanical vibrations caused by the impact load.

The formulated problem (4)-(7) for a long transversely isotropic plate
admits the assumption that the components of the mechanical and
electromagnetic fields are independent of the coordinate $x$. Using the
procedure described in Section 2, this reduces the governing equations
(1) and (2) to the form:


1  dNy 


yy 


d2\ 


,  dv 


h  dy 


=  Pjt 2+  °*Bz  T7  -  VxByBz  TTT  + 


dw  ,  ex  -  £() 


dt 


dt 


B22 


EXBZ 


dN \ 


dW 


-g-  -  (e.v  -  £0 )ExB*y^+BzJx(t)  +  axExBz 

_  vv  ,  FZ\Xl‘)  _  <5';  .  _  tn*\2^W 

-  P  +  —  axBxBz  —  +  C7V  (By )  -57 


1  dNyz  =od2w  PzM 

h  dy  dt 2  h 


y>  dt 


dw 


-  (e,  -  £o)ExBz—  —B*Jx(t)  -  axExB*y, 


dM , 


yy 


ph 3 d2W 


dy 


12  dt 2 


+  Nyz  -  —aJr’Bz 


12 


dW 

dt 
&x  £0


dM, 


B22 

dv  1 
dy  I1B22 
dB - 


Nyy>  3, ,2 


d2W 
12  dw

h2B22Myy' 


dy 


dw 


—x  =  axp  Ex+^-Bz-^-B;  ,  • 


dt 


dt  y 


dEy  dBy 


dy  dt 


(8) 


Here $v$ and $w$ are the middle plane displacement components in $y$-
and $z$-directions, respectively;
$N_{yy} = \int_{-h/2}^{h/2} \tau_{yy}\,dz$ and
$N_{yz} = \int_{-h/2}^{h/2} \tau_{yz}\,dz$ are the stress resultants;
$M_{yy} = \int_{-h/2}^{h/2} \tau_{yy}\,z\,dz$ is the moment resultant;
$E_x$ is the $x$-component of the electric field; $B_z$ is the
$z$-component of the magnetic induction; $\sigma_x$ and $\varepsilon_x$
are the electrical conductivity and permittivity in $x$-direction,
respectively; $\mu$ is the magnetic permeability; and
$B_{22} = E_2/(1 - \nu_{12}\nu_{21})$, where $E_2$ is Young's modulus
along the $y$-direction, and $\nu_{12}$ and $\nu_{21}$ are the
corresponding Poisson ratios.

The system of the nonlinear PDEs (8) represents the governing equations
in the context of this work. The formulated deterministic boundary-value
problem (7)-(8) for low-velocity impact of a thin composite plate in the
presence of an electromagnetic field forms the basis for the stochastic
PDE-constrained optimization model of optimal vibration mitigation that
is presented in Section 3.


3  A  two-stage  stochastic  PDE-constrained  optimization 
framework 

In this section we first introduce a deterministic PDE-constrained
optimization problem for vibration reduction in composite plates using
an electromagnetic field, which is followed by the more general
two-stage stochastic PDE-constrained optimization framework for control
of composite structures in the presence of uncertainties in mechanical
loads.

3.1  A  PDE-constrained  optimization  formulation

The existence of field coupling effects between mechanical and
electromagnetic fields in electrically conductive solids presents an
opportunity for controlling and/or optimizing the mechanical response of
the corresponding structures via application of an electromagnetic
field. Assuming that the design or performance criterion of the
structural element can be expressed through some function $F$ to be
minimized, the problem of optimization or control of the mechanical
field via electromagnetic field generally reduces to a (nonlinear)
PDE-constrained optimization problem of the form:


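$$\begin{aligned}
\min_{\theta}\quad & F_{[0,T]}(\mathbf{g}, \theta, \xi) && \text{(9a)}\\
\text{s. t.}\quad & \frac{\partial \mathbf{g}}{\partial y} = \Phi\!\left(\mathbf{g}, \frac{\partial \mathbf{g}}{\partial t}, \frac{\partial^{2} \mathbf{g}}{\partial t^{2}}, \theta, \xi\right), \quad t \in [0,T], && \text{(9b)}\\
& G\!\left(\mathbf{g}, \frac{\partial \mathbf{g}}{\partial t}\right)\bigg|_{y=\pm a/2} = 0, \quad t \in [0,T], && \text{(9c)}\\
& \underline{\theta} \le \theta \le \overline{\theta}, && \text{(9d)}
\end{aligned}$$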


where vector $\mathbf{g}$ represents the components of the mechanical
field, i.e., displacements and stress and moment resultants in the
governing equations (8), vector $\theta$ contains the parameters of the
electromagnetic field, vector $\xi$ denotes the parameters of the
mechanical load, $F_{[0,T]}(\mathbf{g}, \theta, \xi)$ is the
design/performance fitness function of the structure that is observed
during time interval $[0,T]$, constraints (9b) and (9c) represent the
system of governing PDEs (8) with boundary conditions (7), respectively,
and $\underline{\theta}$ and $\overline{\theta}$ are the lower and upper
bounds for the vector of control variables $\theta$. Note that for the
sake of simplicity, we omit the explicit dependency of $\mathbf{g}$ on
the time variable $t$.

As the purpose of our optimization problem is to minimize the
post-impact vibrations of the plate, the optimization criterion $F$ in
(9) is defined as the average squared middle plane displacement of the
plate:

$$F_{[0,T]}(\mathbf{g}, \theta, \xi) = \frac{1}{T}\int_{0}^{T} \big(w_c(\theta, \xi, t)\big)^{2}\,dt, \tag{10}$$

where $w_c = w|_{y=0}$ is the middle plane displacement at the center of
the plate.
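
As a minimal sketch of how a criterion of the form (10) could be
evaluated from simulation output, the Python snippet below approximates
the time average of the squared center displacement with the trapezoidal
rule. The function name `average_squared_displacement`, the sampled
test signal, and the normalization by the length of the observation
window are illustrative assumptions, not part of the original
formulation.

```python
import math


def average_squared_displacement(times, w_c):
    """Trapezoidal estimate of (1/T) * integral of w_c(t)^2 over the sampled window."""
    if len(times) != len(w_c) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        total += 0.5 * dt * (w_c[k - 1] ** 2 + w_c[k] ** 2)
    return total / (times[-1] - times[0])


# Hypothetical signal: a decaying oscillation standing in for w_c(t).
T = 2.0
n = 2001
times = [T * k / (n - 1) for k in range(n)]
w_c = [math.exp(-t) * math.sin(20.0 * t) for t in times]
F = average_squared_displacement(times, w_c)
```

In an actual computation the samples of `w_c` would come from the
numerical solution of the governing equations rather than from a closed
form expression.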


3.2  A  two-stage  stochastic  programming  formulation 

It can be readily seen that the optimal parameters of the
electromagnetic field as a solution of problem (9) depend heavily on the
parameters of the applied impact load. Since the impact load can rarely
be predicted or estimated with sufficient accuracy, in this subsection
we discuss a stochastic extension of the general problem (9) under the
assumption that the impact load is uncertain, or random.


To deal with the uncertainty in the parameters of the impact load, we
resort to the two-stage stochastic optimization framework. In general,
the discipline of stochastic optimization is concerned with determining
optimal decision policies in situations when the decision making process
is influenced by uncertainties in problem data (Prekopa, 1995; Birge and
Louveaux, 1997; Kall and Mayer, 2005; Shapiro et al., 2009). One of the
main assumptions within this framework is that the uncertain parameters
can be described probabilistically as random variables from some
probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the set
of random events, $\mathcal{F}$ is the sigma-algebra, and $P$ is the
probability measure. In other words, while the values of the uncertain
parameters cannot be predicted with high degree of certainty, their
probability distributions are believed to be known. The second
assumption that is prevalent in most of stochastic optimization
literature is that the probability distributions in question are finite
($|\Omega| < \infty$), and uncertainty in any given parameter $\xi$ can
be described by a finite set of possible realizations
$\xi(\omega_1), \ldots, \xi(\omega_n)$, or "scenarios", with each
realization (scenario) $\omega_i \in \Omega$ having a prescribed
non-zero probability $P(\omega_i) > 0$.


The two-stage stochastic optimization framework models the situation
when the decision-making process under uncertainty involves two
decisions, or actions: the initial, or first stage decision/action, and
a subsequent corrective, or recourse, or second stage decision/action.
Namely, the first-stage action is selected under uncertainty, i.e.,
before the actual realizations of the uncertain factors can be observed.
After the first-stage decision has been made, it is assumed that one can
observe the actual realized values of the problem's uncertain parameters
as well as their effect on the outcome of that decision (e.g., a person
must place a bet in a horse race before its start; then the outcome of
the race and the bet determine the winnings, if any).


Clearly, in most cases the first-stage action will not be optimally
suited for any given realization of uncertainty. The second-stage, or
recourse decision/action is made after the particular realization of
uncertainties was observed, and its purpose is to correct the
consequences of the first-stage action with respect to the actual
observed outcome of uncertainty. It is important to emphasize that the
second-stage decision is dependent on the observed realization of
uncertainties and the first-stage decision; in turn, the first-stage
decision must take into account the probability distribution of
uncertainties and the corresponding second-stage actions (for example, a
poorly chosen first-stage action may not allow for any feasible
corrective actions).

Mathematically, a two-stage stochastic optimization problem can be
written in the form:

$$\begin{aligned}
\min\quad & \mathsf{E}_{\omega}\big(f_1(\mathbf{x}, \omega) + f_2(\mathbf{x}, \mathbf{y}(\omega), \omega)\big)\\
\text{s. t.}\quad & h_1(\mathbf{x}, \omega) \le 0, \quad \forall\,\omega \in \Omega,\\
& h_2(\mathbf{x}, \mathbf{y}(\omega), \omega) \le 0, \quad \forall\,\omega \in \Omega.
\end{aligned} \tag{11}$$

Here, $\mathbf{x}$ denotes the vector of first-stage decisions and
$\mathbf{y} = \mathbf{y}(\omega)$ denotes the second-stage decision;
note that we explicitly indicate its dependence on the random element
$\omega$ from the set $\Omega$ of all possible random events. Function
$f_1(\mathbf{x}, \omega)$ denotes the first-stage design/decision
criterion, and $f_2(\mathbf{x}, \mathbf{y}(\omega), \omega)$ denotes the
corresponding criterion for the second-stage action. Similarly,
$h_1(\mathbf{x}, \omega) \le 0$ represents the first-stage constraints
to be satisfied by the first-stage decision $\mathbf{x}$, and the next
constraint stipulates that the second-stage constraints to be satisfied
by the second-stage decision $\mathbf{y}(\omega)$ may depend explicitly
on the first-stage decision $\mathbf{x}$ and the observed realization of
$\omega$. An optimal solution of (11) delivers the best, on average,
value of the first- and second-stage design criteria.

With respect to the problem of impact of a composite plate that was
discussed in Section 2.2, we consider that the vector $\xi$ of
parameters that describe the mechanical impact load
$p_z(t) = p_z(t \mid \xi)$ is random, $\xi = \xi(\omega)$, with a known
distribution. Probability space $\Omega$ is finite and describes a
finite number of scenarios, $\Omega = \{\omega_1, \ldots, \omega_N\}$,
where each scenario $\omega_i$ corresponds to a specific vector of
parameters $\xi(\omega_i)$ of the impact load, and the probabilities
$P(\omega_i)$ of random elements $\omega_i \in \Omega$ are known. The
discrete scenarios may represent, for example, different types of
foreign objects that may strike the composite plate.

It is assumed that the actual realization of the parameters of impact
load, $\xi = \xi(\omega_k)$ for some $\omega_k \in \Omega$, becomes
known (observable) after a certain time $T_0$ (for example, an
appropriate sensor technology can be employed to estimate the impact
load during the impact event). The decision on the choice of control
parameters $\theta$ must be made at or prior to $t = 0$, before the
actual realization $\xi$ of the mechanical load can be observed. After
time $T_0$, we have an opportunity for a corrective (recourse) action,
which consists in adjusting the electromagnetic field so as to address
the mismatch between the first-stage decision and the actual observation
of uncertain parameters in the best way possible.

Specifically, during the first stage one applies an electromagnetic
field with pre-computed parameters $\theta$ so as to minimize the
expected vibrations during the time period $t \in [0, T_0]$. It is
assumed that during this time interval the profile of the mechanical
load can be observed and identified, which allows for a subsequent
correction $\theta' = \theta'(\omega)$ of the original selection
$\theta$, where we again explicitly indicate that the second-stage
action $\theta'$ depends on the observed realization
$\omega \in \Omega$. Then, the two-stage stochastic PDE-constrained
optimization problem that minimizes the plate's expected deflections can
be formulated as:


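$$\begin{aligned}
\min_{\theta,\,\theta'(\omega)}\quad & \mathsf{E}_{\omega}\Big(F_{[0,T_0]}\big(\mathbf{g}(\omega), \theta, \xi(\omega)\big) + F_{[T_0,T_1]}\big(\mathbf{g}'(\omega), \theta'(\omega), \xi(\omega)\big)\Big)\\
\text{s. t.}\quad & \frac{\partial \mathbf{g}(\omega)}{\partial y} = \Phi\!\left(\mathbf{g}, \frac{\partial \mathbf{g}}{\partial t}, \frac{\partial^{2} \mathbf{g}}{\partial t^{2}}, \theta, \xi(\omega)\right), \quad t \in [0,T_0], \quad \forall\,\omega \in \Omega,\\
& G\!\left(\mathbf{g}(\omega), \frac{\partial \mathbf{g}(\omega)}{\partial t}\right)\bigg|_{y=\pm a/2} = 0, \quad t \in [0,T_0], \quad \forall\,\omega \in \Omega,\\
& \frac{\partial \mathbf{g}'(\omega)}{\partial y} = \Phi\!\left(\mathbf{g}', \frac{\partial \mathbf{g}'}{\partial t}, \frac{\partial^{2} \mathbf{g}'}{\partial t^{2}}, \theta'(\omega), \xi(\omega)\right), \quad t \in [T_0,T_1], \quad \forall\,\omega \in \Omega,\\
& G\!\left(\mathbf{g}'(\omega), \frac{\partial \mathbf{g}'(\omega)}{\partial t}\right)\bigg|_{y=\pm a/2} = 0, \quad t \in [T_0,T_1], \quad \forall\,\omega \in \Omega,\\
& \mathbf{g}\big|_{t=T_0} = \mathbf{g}'\big|_{t=T_0}, \quad \frac{\partial \mathbf{g}}{\partial t}\bigg|_{t=T_0} = \frac{\partial \mathbf{g}'}{\partial t}\bigg|_{t=T_0}, \quad \forall\,\omega \in \Omega,\\
& \underline{\theta} \le \theta,\ \theta'(\omega) \le \overline{\theta}, \quad \forall\,\omega \in \Omega.
\end{aligned} \tag{12}$$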
Note the explicit dependence of vectors $\mathbf{g}(\omega)$,
$\mathbf{g}'(\omega)$, $\theta'(\omega)$, and $\xi(\omega)$ on the
random element $\omega \in \Omega$. The first term in the objective
function of problem (12) corresponds to the first stage, when the
parameters of the problem $\xi(\omega)$ are uncertain with a known
discrete distribution. During this stage, an electromagnetic field
characterized by vector of parameters $\theta$ is applied to minimize
the expected value of
$F_{[0,T_0]}(\mathbf{g}(\omega), \theta, \xi(\omega))$, the average
squared middle plane displacement at the center of the plate during time
interval $[0, T_0]$. The first two constraints in (12) stipulate that
the governing PDEs (8) and boundary conditions (7) must hold at
$t \in [0, T_0]$ for any of the possible impact load scenarios.

The second term in the objective of (12) represents the average squared
middle plane displacement at the center of the plate during the second
stage, from $t = T_0$ to $t = T_1$, which depends explicitly on the
second-stage action $\theta'(\omega)$ and implicitly on the preceding
first stage action $\theta$, by means of the continuity conditions that
are given as the fifth line of constraints in (12). The values of vector
$\mathbf{g}$ during time interval $[T_0, T_1]$ are denoted as
$\mathbf{g}'$, and the third and fourth constraints of (12) require that
the governing equations and boundary conditions hold during
$[T_0, T_1]$ for all scenarios $\omega \in \Omega$. The fifth line of
constraints (12) represents the continuity conditions at $t = T_0$ for
the first- and second-stage mechanical fields $\mathbf{g}$ and
$\mathbf{g}'$.
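
The structure of problem (12) can be illustrated on a deliberately
simplified surrogate. In the sketch below the plate is replaced by a
hypothetical single-degree-of-freedom oscillator whose damping and
forcing depend on a scalar control, the impact scenarios and all
constants are invented, and the problem is solved by enumeration over a
coarse grid; none of these choices come from the paper. The sketch only
shows that, for a fixed first-stage control, the second-stage controls
can be chosen independently for each scenario, starting from the state
reached at the switching time (the continuity conditions).

```python
import math


def simulate(state, theta, xi, t_start, t_end, dt=0.002):
    """Integrate the surrogate w'' + (c0 + theta^2) w' + k w = p(t; xi) - j0*theta
    with semi-implicit Euler; return the final state and the time average of w^2."""
    c0, k, j0 = 0.2, 100.0, 0.5          # hypothetical surrogate constants
    magnitude, duration = xi             # half-sine impact pulse parameters
    w, v = state
    n_steps = int(round((t_end - t_start) / dt))
    acc = 0.0
    for i in range(n_steps):
        t = t_start + i * dt
        load = magnitude * math.sin(math.pi * t / duration) if t < duration else 0.0
        a = load - j0 * theta - (c0 + theta ** 2) * v - k * w
        v += dt * a
        w += dt * v
        acc += w * w * dt
    return (w, v), acc / (t_end - t_start)


def two_stage_value(theta, scenarios, grid, t0=0.5, t1=2.0):
    """Expected objective for a first-stage theta with the best recourse per scenario."""
    expected = 0.0
    recourse = {}
    for name, (prob, xi) in scenarios.items():
        # First stage: the system starts at rest and theta is scenario-independent.
        state_t0, f1 = simulate((0.0, 0.0), theta, xi, 0.0, t0)
        # Second stage: continuity at t0, control chosen after observing the scenario.
        f2, best = min((simulate(state_t0, th2, xi, t0, t1)[1], th2) for th2 in grid)
        recourse[name] = best
        expected += prob * (f1 + f2)
    return expected, recourse


def solve_two_stage(scenarios, grid):
    """Enumerate first-stage controls and keep the one with the lowest expected value."""
    best = None
    for theta in grid:
        value, recourse = two_stage_value(theta, scenarios, grid)
        if best is None or value < best[0]:
            best = (value, theta, recourse)
    return best


# Invented scenarios: name -> (probability, (pulse magnitude, pulse duration)).
scenarios = {
    "light": (0.5, (5.0, 0.1)),
    "medium": (0.3, (20.0, 0.1)),
    "heavy": (0.2, (50.0, 0.2)),
}
grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
value, theta_first, theta_second = solve_two_stage(scenarios, grid)
```

Enumeration is used here only because the surrogate is cheap; it says
nothing about how the full PDE-constrained problem should be solved.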

The two-stage stochastic PDE-constrained optimization problem (12)
formalizes the proposed approach to control of mechanical structures
under uncertainties with respect to the considered problem of impact of
a composite plate. Clearly, the proposed framework allows for obvious
generalizations. In the remainder of the paper we discuss the numerical
solution procedures for problem (12) as well as physical viability of
its solutions.


4  Numerical  solution  and  optimization  methods 

In this section we discuss the basic steps of the solution procedure for
the two-stage stochastic PDE-constrained problem (12) in the case of an
impacted composite plate as presented in Section 2.2.


4.1  Numerical  solution  of  the  governing  system  of  PDEs 


The presence of a system of nonlinear PDEs as constraints in problem
(12) necessitates effective solution methods for the respective PDEs in
order to solve (12). With respect to the specific boundary value problem
for the plate subjected to impact and electromagnetic loads, we employ
the methods proposed in Zhupanska and Sierakowski (2007); Barakati and
Zhupanska (2012a). For the sake of completeness of the exposition, we
outline the key points of the corresponding solution procedure below.

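The system of nonlinear governing PDEs (8) that enters the two-stage
PDE-constrained problem (12) can be rewritten in the form:

$$\frac{\partial \mathbf{g}}{\partial y} = \Phi\!\left(\mathbf{g}, \frac{\partial \mathbf{g}}{\partial t}, \frac{\partial^{2} \mathbf{g}}{\partial t^{2}}, \theta, \xi\right), \tag{13}$$

where $\mathbf{g} = \mathbf{g}(x, y, t, \theta)$ is a vector of
variables
$\mathbf{g} = [v, w, W, N_{yy}, N_{yz}, M_{yy}, E_x, B_z]$, $\Phi$ is a
nonlinear function from (8), and $\theta$ is the optimization variable,
i.e., the vector containing the parameters of the electromagnetic field
(to be defined in Section 5).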

A numerical solution procedure for this system consists of a sequential
application of a finite difference time integration, quasilinearization
of the resulting system of nonlinear ordinary differential equations
(ODEs), and a finite difference spatial integration of the obtained
two-point boundary value problem. The first step is to discretize (13)
with respect to time $t$ by applying the Newmark finite difference time
integration scheme (Newmark, 1959). This reduces (13) to the nonlinear
two-point boundary value problem for the system of ODEs:

$$\frac{d\mathbf{g}}{dy} = \Phi_1(\mathbf{g}, y, \theta, \xi). \tag{14}$$
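
To make the time discretization step more concrete, the following
sketch applies a Newmark scheme to a linear single-degree-of-freedom
oscillator. This is only an illustration under stated assumptions: the
oscillator, the function name `newmark`, and the parameter choice
`beta = 1/4`, `gamma = 1/2` (average acceleration) are not taken from
the paper, where the same kind of relations is applied to the time
derivatives in (13) so that only derivatives in $y$ remain.

```python
import math


def newmark(m, c, k, force, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Newmark integration of m*u'' + c*u' + k*u = force(t) for a single DOF.

    Returns the list of displacements at t = 0, dt, ..., n_steps*dt.
    """
    u, v = u0, v0
    a = (force(0.0) - c * v - k * u) / m      # consistent initial acceleration
    history = [u]
    for n in range(1, n_steps + 1):
        t = n * dt
        # Predictors: the part of the new state known from the previous level.
        u_pred = u + dt * v + (0.5 - beta) * dt ** 2 * a
        v_pred = v + (1.0 - gamma) * dt * a
        # Equation of motion at the new time level, solved for the new acceleration.
        a = (force(t) - c * v_pred - k * u_pred) / (m + gamma * dt * c + beta * dt ** 2 * k)
        # Correctors.
        u = u_pred + beta * dt ** 2 * a
        v = v_pred + gamma * dt * a
        history.append(u)
    return history


# Free undamped vibration with period 1: u(t) = cos(2*pi*t).
omega = 2.0 * math.pi
displacements = newmark(1.0, 0.0, omega ** 2, lambda t: 0.0, 1.0, 0.0, 0.001, 1000)
```

Because velocity and acceleration at the new time level are expressed
through the new displacement and known quantities, each time step leaves
a problem in the remaining (spatial) variable only.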

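This system is solved at discrete moments of time with timestep
$\Delta t$ by using the quasilinearization method of Bellman and Kalaba
(1965). This method allows for substituting the solution of (14) with a
sequential solution of a linearized system with linearized boundary
conditions:

$$\begin{aligned}
\frac{d\mathbf{g}^{k+1}}{dy} &= \Phi_1(\mathbf{g}^{k}, y, \theta, \xi) + \mathbf{A}(\mathbf{g}^{k}, y, \theta, \xi)\,(\mathbf{g}^{k+1} - \mathbf{g}^{k}),\\
\mathbf{D}_1(\mathbf{g}^{k})\,\mathbf{g}^{k+1}(y_0) &= \mathbf{d}_1(\mathbf{g}^{k}),\\
\mathbf{D}_2(\mathbf{g}^{k})\,\mathbf{g}^{k+1}(y_N) &= \mathbf{d}_2(\mathbf{g}^{k}),
\end{aligned} \tag{15}$$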


where $\mathbf{g}^{k+1}$ and $\mathbf{g}^{k}$ are the solutions on the
current and previous iteration steps. A good choice for the initial
guess $\mathbf{g}^{0}$ is a solution from the previous time step. Points
$y_0$ and $y_N$ correspond to the edges of the plate, matrices
$\mathbf{D}_i(\mathbf{g}^{k})$ and vectors
$\mathbf{d}_i(\mathbf{g}^{k})$, $i = 1, 2$, are derived from the
boundary conditions at $y = y_0$ and $y = y_N$. The sequence
$\{\mathbf{g}^{k+1}\}$ of the solutions of the system (15) quickly
converges to the solution of the nonlinear system and the stopping
criterion for the iterative procedure is:

$$\left| g_i^{k+1} / g_i^{k} - 1 \right| < \delta, \tag{16}$$

where $\delta > 0$ is the prescribed accuracy.
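
The quasilinearization loop can be sketched on a scalar model problem.
The example below is an illustration only: the boundary value problem
$u'' = \tfrac{3}{2}u^{2}$, $u(0) = 4$, $u(1) = 1$ (with known solution
$u = 4/(1+y)^{2}$), the finite difference discretization, and the
tridiagonal solver are assumptions chosen for brevity and are unrelated
to the plate equations. Each iteration solves a problem that is linear
in the new iterate and stops with a relative change test of the form
(16).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def quasilinearize(n=200, delta=1e-10, max_iter=50):
    """Solve u'' = 1.5*u^2, u(0)=4, u(1)=1 by successive linearization.

    Linearized problem for the new iterate u given the previous iterate p:
        u'' - 3*p*u = 1.5*p^2 - 3*p^2 = -1.5*p^2
    """
    h = 1.0 / n
    left, right = 4.0, 1.0
    # Initial guess: straight line between the boundary values.
    u = [left + (right - left) * i * h for i in range(n + 1)]
    for iteration in range(1, max_iter + 1):
        m = n - 1
        lower = [1.0] * m
        upper = [1.0] * m
        diag = [-(2.0 + 3.0 * h * h * u[i]) for i in range(1, n)]
        rhs = [-1.5 * h * h * u[i] ** 2 for i in range(1, n)]
        rhs[0] -= left       # known boundary values moved to the right-hand side
        rhs[-1] -= right
        u_new = [left] + solve_tridiagonal(lower, diag, upper, rhs) + [right]
        change = max(abs(a / b - 1.0) for a, b in zip(u_new, u))
        u = u_new
        if change < delta:   # relative change test, analogous to (16)
            return u, iteration
    raise RuntimeError("quasilinearization did not converge")


solution, iterations = quasilinearize()
```

In the plate problem the linearized system is a first-order system in
$y$ rather than a single second-order equation, and the initial guess is
taken from the previous time step.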

To solve the system of linear ODEs in (15) we employ the superposition
method (Atkinson et al., 2009). If $M$ is the dimensionality of the
system and there are $M/2$ boundary conditions on both the left
($y = y_0$) and right ($y = y_N$) ends, then we may represent the
solution of the system of the linear ODEs by a linear combination of
$M/2$ linearly independent general solutions of the homogeneous system
and one particular solution of the inhomogeneous system:

$$\mathbf{g}^{k+1}(y) = \sum_{j=1}^{M/2} c_j\,\mathbf{G}^{j}(y) + \mathbf{G}^{M/2+1}(y), \tag{17}$$


where $c_j$ are the linear coefficients. The values of $\mathbf{G}^{j}$,
$j = 1, \ldots, M/2 + 1$, are obtained on the left end from the boundary
conditions and then are propagated to the right end with the aid of the
fourth-order Runge-Kutta method. At the right end the linear
coefficients $c_j$ can be found from the boundary conditions by solving
a system of linear algebraic equations. In order to guarantee that
vectors $\mathbf{G}^{j}$ are independent, and therefore coefficients
$c_j$ are uniquely determined at the right end, an orthonormalization
procedure is employed after each iteration of the Runge-Kutta method.
The corresponding transformation matrices are then used to restore the
coefficients $c_j$.
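
A minimal sketch of the superposition idea in (17), assuming a small
model problem rather than the plate equations: the linear boundary value
problem $u'' - u = -y$, $u(0) = 0$, $u(1) = 2$ is written as a
first-order system with $M = 2$, so there is one homogeneous solution
and one particular solution. Both are started at the left end so as to
satisfy the left boundary condition, propagated with the classical
fourth-order Runge-Kutta method, and combined at the right end. The
model problem and all function names are illustrative; the
orthonormalization step described above is omitted because it is not
needed for a system this small.

```python
import math


def rk4_step(rhs, y, g, h):
    """One classical fourth-order Runge-Kutta step for g' = rhs(y, g)."""
    k1 = rhs(y, g)
    k2 = rhs(y + 0.5 * h, [gi + 0.5 * h * ki for gi, ki in zip(g, k1)])
    k3 = rhs(y + 0.5 * h, [gi + 0.5 * h * ki for gi, ki in zip(g, k2)])
    k4 = rhs(y + h, [gi + h * ki for gi, ki in zip(g, k3)])
    return [gi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for gi, a, b, c, d in zip(g, k1, k2, k3, k4)]


def homogeneous(y, g):
    # g = [u, u'], homogeneous system u'' = u
    return [g[1], g[0]]


def inhomogeneous(y, g):
    # g = [u, u'], full system u'' = u - y
    return [g[1], g[0] - y]


def superposition(n=100, u_right=2.0):
    """Return u on a uniform grid of n+1 points on [0, 1]."""
    h = 1.0 / n
    g_hom = [0.0, 1.0]    # satisfies the homogeneous left condition u(0) = 0
    g_par = [0.0, 0.0]    # satisfies the actual left condition u(0) = 0
    hom, par = [g_hom], [g_par]
    for i in range(n):
        y = i * h
        g_hom = rk4_step(homogeneous, y, g_hom, h)
        g_par = rk4_step(inhomogeneous, y, g_par, h)
        hom.append(g_hom)
        par.append(g_par)
    # Right boundary condition u(1) = u_right fixes the single coefficient c1.
    c1 = (u_right - g_par[0]) / g_hom[0]
    return [c1 * gh[0] + gp[0] for gh, gp in zip(hom, par)]


u_values = superposition()
exact_mid = 0.5 + math.sinh(0.5) / math.sinh(1.0)   # exact u(y) = y + sinh(y)/sinh(1)
```

For larger systems the homogeneous solutions can become nearly linearly
dependent during propagation, which is the situation the
orthonormalization procedure mentioned above is meant to handle.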


4.2  PDE-constrained  optimization  framework 

The existing approaches to PDE-constrained optimization problems can
generally be categorized into two groups (Herzog and Kunisch, 2010). The
first group of methods fits under the umbrella of "black box"
optimization. This framework implies that one is able to obtain certain
information about the objective function, which usually includes the
value of the function for any given feasible point, its gradient and,
perhaps, its higher order derivatives (depending on which optimization
algorithm is employed) at that point. This information is then used to
further direct the search for an optimal solution. It must be
emphasized, however, that PDE constraints are embedded in the
computation of the objective and its gradient and thus need to be
satisfied at every step of the algorithm, which potentially makes this
approach computationally expensive. For example, in the present work the
value of the objective function is obtained by numerically solving a
system of nonlinear PDEs using the procedure described in Section 4.1.

An alternative "discretize-then-optimize" approach consists in, first,
discretizing the system of PDEs and replacing the PDE constraints in the
problem with the resulting discretizations, often in the form of linear
constraints. This generally leads to improved computational efficiency,
as the system of governing PDEs is not required to be solved at every
step. On the other hand, this method is not applicable to every type of
PDE-constrained problem; for example, in our case the governing system
of nonlinear PDEs cannot be solved by straightforward discretization.

The specifics of our particular problem dictate the use of a black-box
first-order optimization procedure, which can be summarized as follows:
(i) compute the objective function by solving the governing system of
PDEs numerically; (ii) compute first-order information, i.e., the
gradient of the objective function at the current feasible point;
(iii) apply a first-order optimization algorithm.
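
The three steps above can be sketched as a generic loop. In the snippet
below the objective is a cheap hypothetical stand-in for "solve the
PDEs, then evaluate the criterion", the gradient is obtained by central
differences (a simple substitute for the differentiation technique
actually used in the paper), and the first-order method is projected
gradient descent with box bounds. The specific algorithm, step size, and
all names are assumptions made for illustration.

```python
def central_difference_gradient(objective, theta, step=1e-6):
    """Step (ii): approximate the gradient of a black-box objective."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += step
        down[i] -= step
        grad.append((objective(up) - objective(down)) / (2.0 * step))
    return grad


def project(theta, lower, upper):
    """Clip each control to its bounds."""
    return [min(max(t, lo), hi) for t, lo, hi in zip(theta, lower, upper)]


def projected_gradient_descent(objective, theta0, lower, upper,
                               rate=0.1, tol=1e-8, max_iter=1000):
    """Step (iii): a basic first-order method that respects box bounds."""
    theta = project(theta0, lower, upper)
    for _ in range(max_iter):
        grad = central_difference_gradient(objective, theta)
        candidate = project([t - rate * g for t, g in zip(theta, grad)], lower, upper)
        if max(abs(a - b) for a, b in zip(candidate, theta)) < tol:
            return candidate
        theta = candidate
    return theta


def black_box(theta):
    """Step (i): hypothetical stand-in for a PDE solve followed by the criterion."""
    return (theta[0] - 2.0) ** 2 + 3.0 * (theta[1] + 1.0) ** 2


theta_opt = projected_gradient_descent(black_box, [0.5, 2.0], [0.0, 0.0], [1.5, 5.0])
```

With an expensive objective, every call to `black_box` would correspond
to a full numerical solution of the governing system, which is why the
cost of gradient evaluation matters in this setting.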

The value of the objective function in (12) depends on the solution of a
system of PDEs, which makes analytical computation of its gradient
impractical. To this end, numerical differentiation techniques, such as
complex differentiation (Squire and Trapp, 1998) or some version of
automatic differentiation (Rall, 1986) can be employed.

4.3  Numerical  differentiation 

The proposed solution approach for the two-stage stochastic
PDE-constrained optimization problem (12) is based on first-order
methods and requires computation of the gradient of the objective
function at a given feasible point. Specifically, we are interested in
the full derivatives of $F_{[0,T_0]}(g(\omega),\theta,\xi(\omega))$ with
respect to $\theta$ and of
$F_{[T_0,T_1]}(g'(\omega),\theta'(\omega),\xi(\omega))$ with respect to
$[\theta,\theta'(\omega)]$. The function $F$ itself has a quite simple
structure; however, $g$ and $g'$ depend implicitly on the parameters
$\theta$ and/or $\theta'(\omega)$, as they are coupled through the
system of governing equations (8). In the remainder of this subsection
we do not distinguish between $\theta$ and $\theta'(\omega)$ and refer
to them as a single vector of parameters $\theta$ that is used as an
input to the system of governing equations.

There exist a number of methods for numerically computing the
derivative of a function, among them the finite-difference method, the
adjoint method, complex differentiation, and automatic (algorithmic)
differentiation. In our work, we use a method that is closely related
to both complex and automatic differentiation.

The complex differentiation method (Squire and Trapp, 1998; Martins
et al, 2003, 2001) is applicable to analytic functions of a real
variable. Instead of taking a small step in the direction of the real
axis, as is customary in finite difference methods, a small increment
is considered in the direction of the imaginary axis:

$$f(x + is) = f(x) + isf'(x) - \frac{s^2}{2!}f''(x) - i\frac{s^3}{3!}f'''(x) + O(s^4).$$

If $s$ is small enough, by computing $f(x + is)$ one can obtain
approximations to the values of $f(x)$ and $f'(x)$:

$$f(x) = \operatorname{Re} f(x+is) + O(s^2), \qquad f'(x) = \frac{\operatorname{Im} f(x+is)}{s} + O(s^2).$$

As can be readily seen, the complex differentiation method offers a
significant improvement in accuracy over the traditional
finite-difference approach at a relatively low computational overhead,
as there is no subtractive cancellation error. In practice, it allows
for fast and stable computation of derivatives at almost machine
precision. However, in the multivariate case, $f = f(\mathbf{x})$,
$\mathbf{x} \in \mathbb{R}^m$, one would have to evaluate
$f(\mathbf{x} + is\mathbf{e}_k)$, where $\mathbf{e}_k$ is the $k$-th
unit coordinate vector in $\mathbb{R}^m$, for each $k = 1,\dots,m$, in
order to compute the gradient of $f$ at the point $\mathbf{x}$. This
significantly increases the computational effort for evaluation of the
gradient of $f(\mathbf{x})$. Alternative methods for numerical
differentiation of multivariate functions that are based on the same
principle employ various generalizations of complex numbers.
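
The complex-step formula above can be illustrated with a minimal Python sketch. The language, the test functions `f1` and `f2`, and the step size `s = 1e-20` are assumptions made for illustration only and are not taken from the study.

```python
import cmath
import math


def complex_step_derivative(f, x, s=1e-20):
    """Approximate f'(x) by Im f(x + i*s) / s; no subtraction is involved."""
    return f(complex(x, s)).imag / s


def complex_step_gradient(f, x, s=1e-20):
    """Gradient of f: R^m -> R, using one complex evaluation per coordinate."""
    grad = []
    for k in range(len(x)):
        z = [complex(v, 0.0) for v in x]
        z[k] = complex(x[k], s)  # perturb only the k-th coordinate
        grad.append(f(z).imag / s)
    return grad


# Hypothetical analytic test functions.
def f1(z):
    return cmath.exp(z) * cmath.sin(z)


def f2(z):
    return z[0] * z[0] * z[1] + cmath.sin(z[1])


d1 = complex_step_derivative(f1, 1.0)
g2 = complex_step_gradient(f2, [1.5, 0.5])

# Analytical reference values for comparison.
d1_exact = math.e * (math.sin(1.0) + math.cos(1.0))
g2_exact = [2.0 * 1.5 * 0.5, 1.5 ** 2 + math.cos(0.5)]
```

Note that the gradient routine needs `m` evaluations of `f`, which is the cost the text refers to in the multivariate case.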

Existing generalizations of complex numbers rely on different
definitions of the imaginary unit. One such generalization is
represented by dual numbers (Kantor and Solodovnikov, 1989; Piponi,
2004) of the form $a + \eta b$, where $\eta$ is the dual unit,
$\eta \neq 0$, $\eta^2 = 0$. Similarly, hyper-dual numbers have the
form $a = a_0 + \eta_1 a_1 + \dots + \eta_m a_m$ with $m$ imaginary
dual parts $\eta_i$ such that $\eta_i\eta_j = 0$ for all $i, j$. The
arithmetic operations with hyper-dual numbers are defined as follows:

$$a + b = a_0 + b_0 + \eta_1(a_1 + b_1) + \dots + \eta_m(a_m + b_m),$$

$$ab = a_0 b_0 + \eta_1(a_1 b_0 + a_0 b_1) + \dots + \eta_m(a_0 b_m + a_m b_0),$$

$$a/b = \bigl(a_0 b_0 + \eta_1(a_1 b_0 - a_0 b_1) + \dots + \eta_m(a_m b_0 - a_0 b_m)\bigr)/b_0^2. \qquad (18)$$

Then, given a multivariate function $f(x_1,\dots,x_m)$, each of its $m$
arguments can be represented as a hyper-dual number with $m$ imaginary
parts. More precisely, let variable $x_i$ at a given point
$\mathbf{x}^0$ be represented by a hyper-dual number whose real part is
equal to $x_i^0$ and all imaginary parts are set to zero, with the
exception of the $i$-th imaginary part, which is set to 1. It can be
shown that upon application of the above hyper-dual arithmetic rules
(18) for computation of the (hyper-dual) value of $f$, one obtains that
the real part of the result is equal to $f(x_1^0,\dots,x_m^0)$, and the
$i$-th imaginary part is equal to
$(\partial f/\partial x_i)|_{\mathbf{x}=\mathbf{x}^0}$. As an
illustration, consider

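$$f(x_1,x_2) = \frac{(x_1 + x_2)\,x_1}{x_2}.$$

To find $\frac{\partial}{\partial x_1}f(x_1^0,x_2^0)$ and
$\frac{\partial}{\partial x_2}f(x_1^0,x_2^0)$, let
$x_1 = x_1^0 + \eta_1 1 + \eta_2 0$, $x_2 = x_2^0 + \eta_1 0 + \eta_2 1$.
Then

$$f(x_1,x_2) = \frac{x_1^0(x_1^0 + x_2^0)}{x_2^0} + \eta_1\frac{2x_1^0 + x_2^0}{x_2^0} - \eta_2\frac{(x_1^0)^2}{(x_2^0)^2}
= f(x_1^0,x_2^0) + \eta_1\frac{\partial}{\partial x_1}f(x_1^0,x_2^0) + \eta_2\frac{\partial}{\partial x_2}f(x_1^0,x_2^0).$$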


The described technique is, in fact, a forward mode of automatic
differentiation (Rall, 1986), in which derivative information is
propagated forward with the computations according to the chain rule of
differentiation. There are different variations of this framework; more
discussion of automatic differentiation with hyper-dual numbers can be
found in Rall (1986); Piponi (2004); Fike and Alonso (2011).
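
To make the forward-mode mechanism concrete, the following Python sketch implements the arithmetic rules (18) and seeds each argument with its own dual unit. The class name `HyperDual`, the helper `value_and_gradient`, and the test function are hypothetical names introduced here for illustration; the study's own implementation is not shown in the text.

```python
class HyperDual:
    """Number a_0 + eta_1 a_1 + ... + eta_m a_m with eta_i * eta_j = 0."""

    def __init__(self, real, dual):
        self.real = float(real)
        self.dual = [float(d) for d in dual]

    @staticmethod
    def _lift(other, m):
        # Promote a plain real constant to a hyper-dual number with zero dual parts.
        if isinstance(other, HyperDual):
            return other
        return HyperDual(other, [0.0] * m)

    def __add__(self, other):
        b = self._lift(other, len(self.dual))
        return HyperDual(self.real + b.real,
                         [ai + bi for ai, bi in zip(self.dual, b.dual)])

    __radd__ = __add__

    def __mul__(self, other):
        b = self._lift(other, len(self.dual))
        return HyperDual(self.real * b.real,
                         [ai * b.real + self.real * bi
                          for ai, bi in zip(self.dual, b.dual)])

    __rmul__ = __mul__

    def __truediv__(self, other):
        b = self._lift(other, len(self.dual))
        return HyperDual(self.real / b.real,
                         [(ai * b.real - self.real * bi) / b.real ** 2
                          for ai, bi in zip(self.dual, b.dual)])


def value_and_gradient(f, point):
    """Evaluate f at `point`; the i-th dual part of the result is df/dx_i."""
    m = len(point)
    args = [HyperDual(x, [1.0 if j == i else 0.0 for j in range(m)])
            for i, x in enumerate(point)]
    out = f(*args)
    return out.real, out.dual


# Hypothetical test function, differentiated at the point (3, 2).
def f(x1, x2):
    return (x1 + x2) * x1 / x2


value, grad = value_and_gradient(f, [3.0, 2.0])
```

A single hyper-dual evaluation returns the function value together with all `m` partial derivatives, in contrast to the `m` separate evaluations required by the complex-step approach.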

In our case we need the full derivative of
$F_{[0,T_0]}(g(\theta),\theta,\xi)$ with respect to $\theta$. The
structure of $F$ itself is quite simple and $\partial F/\partial\theta$
can be found analytically. The biggest difficulty is to find the
derivative of $g$ with respect to $\theta$. To do this, the governing
system of equations (8) is solved numerically using hyper-dual numbers.
The imaginary dual parts of all the input parameters except $\theta$
are set to zero, while the $i$-th component of vector $\theta$ has the
form $\theta_i = \theta_i^0 + 1\eta_i$, where $\theta_i^0$ is the
corresponding numerical value of the input parameter. The imaginary
dual parts of the resulting hyper-dual values of the components of
vector $g$ then represent the sought partial derivatives of $g$.


4.4  Optimization  methods 

Having computed the value and gradient of the objective function, we
are now in a position to apply a first-order optimization scheme. In
this study, we employed the active set method due to Hager and Zhang
(2006). The algorithm combines a nonmonotone gradient projection scheme
with a regular unconstrained conjugate gradient method and switches
between them under certain conditions. We outline the ideas of both
methods and how they are connected. More details on the active set
algorithm, including convergence analysis, can be found in Hager and
Zhang (2006, 2005).

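The nonmonotone gradient projection algorithm (NGPA) can be applied to
the so-called “box-constrained” optimization problems of the form

$$\min\,\{f(\mathbf{x}) : \mathbf{l} \le \mathbf{x} \le \mathbf{u}\}.$$

Let us denote the feasible set of this problem as
$\Theta = \{\mathbf{x} \in \mathbb{R}^n : \mathbf{l} \le \mathbf{x} \le \mathbf{u}\}$,
and define $P(\mathbf{x})$ as the projection of a point in
$\mathbb{R}^n$ onto $\Theta$:

$$P(\mathbf{x}) = \arg\min_{\mathbf{y}\in\Theta}\|\mathbf{x} - \mathbf{y}\|.$$

If $\mathbf{x}_k \in \Theta$ is the current iterate, we compute
$\mathbf{x}'_k = \mathbf{x}_k - \alpha_k\mathbf{g}_k$, where
$\mathbf{g}_k$ is the gradient of the objective function $f$ at
$\mathbf{x}_k$ and $\alpha_k$ is the corresponding step length. The
point $\mathbf{x}'_k$ can be infeasible, so its projection
$P(\mathbf{x}'_k)$ onto the feasible set is computed. By using a
nonmonotone line search in the direction of the vector
$\mathbf{d}_k = P(\mathbf{x}'_k) - \mathbf{x}_k$, a new iterate
$\mathbf{x}_{k+1}$ is found.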
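
The projection step and the nonmonotone line search can be sketched in Python as follows. This is a simplified illustration, not the NGPA of Hager and Zhang (2006): the fixed trial step `alpha`, the reference value taken as the maximum of the last `memory` objective values, and the quadratic test problem are all assumptions made here.

```python
def project(x, lower, upper):
    """Projection onto the box {x : lower <= x <= upper} (componentwise clipping)."""
    return [min(max(xi, lo), up) for xi, lo, up in zip(x, lower, upper)]


def projected_gradient(f, grad, x0, lower, upper,
                       alpha=1.0, memory=5, sigma=1e-4, tol=1e-10, max_iter=500):
    x = project(x0, lower, upper)
    history = [f(x)]  # objective values of the accepted iterates
    for _ in range(max_iter):
        g = grad(x)
        trial = project([xi - alpha * gi for xi, gi in zip(x, g)], lower, upper)
        d = [ti - xi for ti, xi in zip(trial, x)]  # d_k = P(x'_k) - x_k
        if max(abs(di) for di in d) < tol:
            break  # projected step vanishes: stationary point on the box
        slope = sum(gi * di for gi, di in zip(g, d))  # negative for d != 0
        f_ref = max(history[-memory:])  # nonmonotone reference value
        step = 1.0
        while True:
            x_new = [xi + step * di for xi, di in zip(x, d)]
            f_new = f(x_new)
            if f_new <= f_ref + sigma * step * slope or step < 1e-12:
                break
            step *= 0.5  # backtrack along d_k; iterates stay feasible
        x = x_new
        history.append(f_new)
    return x


# Hypothetical test problem: the unconstrained minimizer (2, -1) lies outside the box.
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
grad = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]
x_opt = projected_gradient(f, grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

Because the acceptance test compares against the largest of several recent objective values rather than the latest one, individual iterations are allowed to increase the objective, which is what "nonmonotone" refers to.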

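For unconstrained optimization problems, a conjugate gradient method
can be used. Its main principle is that every step is made in the
direction of steepest descent corrected by the previous direction
multiplied by some $\beta$:

$$\mathbf{x}_{k+1} = \mathbf{x}_k + \delta_k\mathbf{d}_k, \qquad \mathbf{d}_{k+1} = -\mathbf{g}_{k+1} + \beta_k\mathbf{d}_k, \qquad \mathbf{d}_0 = -\mathbf{g}_0,$$

where $\delta_k$ is the step length chosen by inexact line search. In
our work, the following conjugate gradient method by Hager and Zhang
(2005) is used:

$$\beta_k = \max\{\beta_k^N, \eta_k\}, \qquad \beta_k^N = \frac{1}{\mathbf{d}_k^T\mathbf{y}_k}\Bigl(\mathbf{y}_k - 2\mathbf{d}_k\frac{\|\mathbf{y}_k\|^2}{\mathbf{d}_k^T\mathbf{y}_k}\Bigr)^T\mathbf{g}_{k+1}, \qquad \eta_k = \frac{-1}{\|\mathbf{d}_k\|\min\{\eta, \|\mathbf{g}_k\|\}},$$

where $\mathbf{y}_k = \mathbf{g}_{k+1} - \mathbf{g}_k$ and $\eta > 0$
is a constant.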
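
The update rule of Hager and Zhang (2005) can be sketched in Python as below. To keep the example short and deterministic it is applied to a strictly convex quadratic with an exact line search, which is an assumption made here; the method as described in the text relies on an inexact line search. The helper names, the value `eta = 0.01`, and the test matrix are likewise illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hz_beta(g_new, g_old, d, eta=0.01):
    """Truncated conjugate gradient parameter max{beta_N, eta_k}."""
    y = [a - b for a, b in zip(g_new, g_old)]  # y_k = g_{k+1} - g_k
    dy = dot(d, y)
    v = [yi - 2.0 * di * dot(y, y) / dy for yi, di in zip(y, d)]
    beta_n = dot(v, g_new) / dy
    eta_k = -1.0 / (dot(d, d) ** 0.5 * min(eta, dot(g_old, g_old) ** 0.5))
    return max(beta_n, eta_k)


def cg_quadratic(A, b, x0, tol=1e-12, max_iter=100):
    """Minimize 0.5 x^T A x - b^T x for a symmetric positive definite A."""
    matvec = lambda v: [dot(row, v) for row in A]
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(x), b)]  # gradient A x - b
    d = [-gi for gi in g]  # d_0 = -g_0
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        Ad = matvec(d)
        step = -dot(g, d) / dot(d, Ad)  # exact minimizer along d (quadratic only)
        x = [xi + step * di for xi, di in zip(x, d)]
        g_new = [gi + step * adi for gi, adi in zip(g, Ad)]
        beta = hz_beta(g_new, g, d)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


# Hypothetical 2x2 test problem with solution (1/11, 7/11).
x_min = cg_quadratic([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

With an exact line search on a quadratic, the new gradient is orthogonal to the previous direction, so the formula reduces to the classical linear conjugate gradient update.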

The nonmonotone gradient projection algorithm is globally convergent
and in theory can deal with box-constrained optimization quite well.
However, in practice its speed of convergence can be slow near a local
minimizer. At the same time, the conjugate gradient method often has
superlinear convergence for unconstrained optimization problems. The
active set algorithm takes advantage of both these methods by using
NGPA to determine active constraints (faces of the feasible set
$\Theta$ containing the current iterate $\mathbf{x}_k$). Then, the
conjugate gradient method is used to optimize over that face.


4.5  Solution  procedure 

Now, knowing all the main components of the solution procedure, we can
assemble them together to show how the problem is solved. Since the
system of governing equations is solved numerically by discretization,
we modify expression (10) of the optimization criterion appropriately.
Namely, assuming that the discretization time step $\Delta t$ is
sufficiently small, the integral in (10) can be approximated by:

$$F_{[0,T]}(g,\theta,\xi) \approx \Delta t\sum_{k=1}^{T/\Delta t}\bigl(w_c(t_k,\theta,\xi)\bigr)^2, \qquad (19)$$

where the values of $w_c$ are taken at time instants $t_k = k\Delta t$.
Note also that the constant factor $\Delta t$ in the above summation
can be disregarded in the optimization problem since it is present as a
constant scaling factor in the objectives of optimization problems (9)
and (12).
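
As a hedged illustration of how a discretized criterion of this kind and its gradient can be accumulated, the Python sketch below assumes the criterion is the sum of squared deflections scaled by the time step; any further constant factor would only rescale both outputs. The function name and the sample data are hypothetical.

```python
def objective_and_gradient(w, dw, dt):
    """
    w[k]     : deflection w_c at time instant t_k
    dw[k][j] : derivative of w_c(t_k) with respect to the j-th control parameter
    dt       : time step
    Returns dt * sum_k w[k]^2 and its gradient obtained by the chain rule.
    """
    m = len(dw[0])
    value = dt * sum(wk * wk for wk in w)
    grad = [dt * sum(2.0 * w[k] * dw[k][j] for k in range(len(w)))
            for j in range(m)]
    return value, grad


# Hypothetical data: two time instants, two control parameters.
value, grad = objective_and_gradient([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], 0.5)
```

In a setting like the one described, `w` and `dw` would be the real and dual parts produced by a hyper-dual solve of the governing equations.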

The system of governing equations is solved using hyper-dual
arithmetic to obtain the derivative of $w_c$ with respect to $\theta$
for each time step $t_k$. Knowing all the derivatives
$\partial w_c/\partial\theta$, an approximation of the derivatives
$\frac{\partial}{\partial\theta}F_{[0,T]}(g,\theta,\xi)$ can be found
by applying the standard chain rule to (19). Given that the
distribution of stochastic factors $\xi = \xi(\omega)$ in the two-stage
stochastic PDE-constrained optimization problem (12) is assumed to be
discrete with a finite support, and therefore can be modeled by a
finite scenario set $\Omega = \{\omega_1,\dots,\omega_N\}$, where
$P(\omega_i) > 0$ and $\sum_{i} P(\omega_i) = 1$, problem (12) can be
presented in the following form

$$\min_{\theta,\,\theta'(\omega)}\ \sum_{i=1}^{N} P(\omega_i)\Bigl(F_{[0,T_0]}\bigl(g(\omega_i),\theta,\xi(\omega_i)\bigr) + F_{[T_0,T_1]}\bigl(g'(\omega_i),\theta'(\omega_i),\xi(\omega_i)\bigr)\Bigr) \qquad (20a)$$

s. t., for all $i \in \{1,\dots,N\}$:

(20b) the governing system of PDEs (8) for $g(\omega_i)$, $t \in [0,T_0]$;
(20c) the boundary conditions for $g(\omega_i)$ at the plate edges $y = \pm a/2$, $t \in [0,T_0]$;
(20d) the governing system of PDEs (8) for $g'(\omega_i)$, $t \in [T_0,T_1]$;
(20e) the boundary conditions for $g'(\omega_i)$ at the plate edges $y = \pm a/2$, $t \in [T_0,T_1]$;
(20f) the continuity conditions linking $g(\omega_i)$ and $g'(\omega_i)$ at $t = T_0$;
$\underline{\theta} \le \theta,\ \theta'(\omega_i) \le \overline{\theta}$.

To find the first group of summands of the objective (20a),
$F_{[0,T_0]}(g(\omega_i),\theta,\xi(\omega_i))$, and their partial
derivatives w.r.t. $\theta$, the boundary-value problem (20b)-(20c) is
solved numerically for each $\omega_i \in \Omega$ using hyper-dual
numbers. For the second group of components of the objective,
$F_{[T_0,T_1]}(g'(\omega_i),\theta'(\omega_i),\xi(\omega_i))$, the
boundary-value problem (20d)-(20e) must be solved and the continuity
conditions (20f) must be satisfied. Note that $g'$ implicitly depends
on $\theta$, and thus in the gradient of $g'$ there are twice as many
components as in the gradient of $g$. In practice, to take into account
this implicit dependence and the continuity conditions in computing the
value and gradient of
$F_{[T_0,T_1]}(g'(\omega_i),\theta'(\omega_i),\xi(\omega_i))$, system
(20b)-(20e) is solved for $t \in [0,T_1]$ using hyper-dual numbers for
each $\omega_i \in \Omega$, with control parameters being switched from
$\theta$ to $\theta'(\omega_i)$ at time $T_0$. Then, the value of $F$
and its derivatives are computed according to (19) with the first
$T_0/\Delta t$ terms being ignored.

In order to perform an optimization step of the active set algorithm,
the two systems of PDEs (20b, 20d) with boundary conditions (20c, 20e)
in the constraints are solved in hyper-dual numbers for each
$\omega_i \in \Omega$ as described above. This enables one to compute
the value and gradient of the objective function (20a). The outlined
computational procedure was implemented in the C++ programming
language.
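
The assembly of the scenario-based objective (20a) and its gradient from per-scenario quantities can be sketched as follows. The sketch is in Python for brevity (the implementation in the study is in C++ and is not shown); the function name, the data layout, and the sample numbers are assumptions made for illustration.

```python
def expected_objective(probs, first_stage, second_stage):
    """
    probs[i]        : scenario probability P(omega_i)
    first_stage[i]  : (F0_i, dF0_i/dtheta) on [0, T0]
    second_stage[i] : (F1_i, dF1_i/dtheta, dF1_i/dtheta'_i) on [T0, T1]
    Returns the expected objective, its gradient w.r.t. the first-stage
    vector theta, and one gradient per scenario w.r.t. theta'(omega_i).
    """
    m = len(first_stage[0][1])
    value = 0.0
    grad_theta = [0.0] * m
    grad_theta2 = []
    for p, (f0, df0), (f1, df1_theta, df1_theta2) in zip(
            probs, first_stage, second_stage):
        value += p * (f0 + f1)
        for j in range(m):
            # The second-stage term also depends on theta through the state.
            grad_theta[j] += p * (df0[j] + df1_theta[j])
        grad_theta2.append([p * v for v in df1_theta2])
    return value, grad_theta, grad_theta2


# Hypothetical data: two scenarios, one control parameter per stage.
value, g1, g2 = expected_objective(
    [0.5, 0.5],
    [(1.0, [1.0]), (3.0, [2.0])],
    [(2.0, [0.5], [4.0]), (4.0, [1.5], [6.0])],
)
```

Each second-stage vector receives gradient contributions only from its own scenario, whereas the first-stage vector collects contributions from every scenario and from both time intervals.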


5  Numerical  results 

In this section we report optimization results for a single-layer,
transversely isotropic (x-axis is the axis of material symmetry and
y-z is the plane of isotropy) carbon fiber reinforced composite plate
of width $a = 0.1524$ m and thickness $h = 0.0021$ m. Elastic
properties of the composite plate are as follows: Young’s modulus in
the fiber direction is $E_1 = 102.97$ GPa, Young’s modulus in the
transverse direction is $E_2 = 7.55$ GPa, Poisson’s ratios are
$\nu_{21} = \nu_{13} = 0.3$, density of the composite is
$\rho = 1594$ kg/m$^3$, electrical conductivity in the fiber direction
is $\sigma_1 = 39000$ S/m and electrical permittivity in the fiber
direction is $\varepsilon_1 = 2.5015 \times 10^{-10}$ F/m. The plate is
subjected to a transverse impact load (4) at the initial time moment,
$t = 0$. Simultaneously, an electromagnetic load is applied, consisting
of the time-dependent electric current applied in the fiber direction
(5) and a constant in-plane magnetic field applied in the direction
perpendicular to the electric current (6) (see Figure 1). Application
of the electromagnetic load is expected to mitigate the effects of the
mechanical impact by maximally reducing the post-impact vibrations of
the plate.

The (randomized) applied impact load $p$ (4) has the following profile,
where the maximum impact pressure $p_0$ and the impact duration
$\tau_p$ are uncertain parameters, which is indicated by their
dependence on a random event $\omega \in \Omega$:

$$p_x(y,t) = 0, \qquad p_y(y,t) = 0,$$

$$p_z(y,t) = \begin{cases} -p_0(\omega)\sqrt{1 - \left(\dfrac{y}{b}\right)^2}\,\sin\dfrac{\pi t}{\tau_p(\omega)}, & |y| \le b,\ 0 < t \le \tau_p(\omega), \\ 0, & b < |y| \le \dfrac{a}{2},\ t > \tau_p(\omega). \end{cases} \qquad (21)$$

Here b = 0.01 h is the width of the impact zone.

In such a way, the vector $\xi(\omega)$ of uncertain parameters in the
two-stage stochastic PDE-constrained optimization problem (12) contains
$p_0$ and $\tau_p$:

$$\xi(\omega) = [p_0(\omega), \tau_p(\omega)].$$

It is assumed that the set $\Omega$ of random events contains three
equiprobable elements, or scenarios:

$$\Omega = \{\omega_1, \omega_2, \omega_3\}, \quad \text{where } P(\omega_i) = 1/3,\ i = 1,2,3.$$

In the context of the conceptual application described in Section 1.1,
this corresponds to the composite plate being hit at random by, e.g.,
three possible types of foreign objects or projectiles. Table 1
presents the numerical values of the possible realizations of the
maximum impact pressure and impact duration of the impact load (21).
The small size of the scenario set is chosen specifically for the
illustrative purposes of our computational experiments; in practice,
realistic descriptions of uncertainties require larger scenario sets.

Table 1 Scenario realizations of the maximum impact pressure $p_0$ and
impact duration $\tau_p$ of the mechanical impact load (21).

Scenario, $\omega$    Probability, $P(\omega)$    $p_0(\omega)$, MPa    $\tau_p(\omega)$, ms
$\omega_1$            1/3                         7.5                   8.0
$\omega_2$            1/3                         10.0                  10.0
$\omega_3$            1/3                         20.0                  12.0

The duration $T_0$ of the first stage was set at $T_0 = 10$ ms, which
is equal to the average duration of impact in the considered scenarios.
This reflects our assumption that an appropriate sensory technology
will allow for estimating the parameters of the impact load during the
impact event (see Sections 1.1 and 3.2). The total duration of
computational time was set at $T_1 = 50$ ms. According to the two-stage
stochastic framework described in Section 3.2, the electromagnetic
field in the configuration prescribed by the first-stage solution is
applied at $t = 0$. At $t = T_0$, the parameters of the electromagnetic
field are changed as dictated by the second-stage solution; in such a
way, the durations of the first and second stages are 10 ms and 40 ms,
respectively.

The parameters of the magnetic field (6) applied to the plate are as
follows:

$$B_x = 0, \qquad B_y = B^* = 1.0\ \text{T}, \qquad B_z = 0, \qquad (22)$$

and the density $\mathbf{J}(t)$ of the time-dependent electric current
(5) applied in the fiber direction is

$$J_x(t) = J_0\, e^{-t/\tau_e}\sin\frac{\pi t}{\tau_s}, \qquad J_y(t) = J_z(t) = 0, \qquad (23)$$

where $J_0$, $\tau_e$, and $\tau_s$ are the parameters determining the
electric current waveform, i.e., the maximum current density, fall and
rise times. The quantities $J_0$, $\tau_e$, and $\tau_s$ constitute the
vector $\theta$ of decision variables, or control parameters:

$$\theta = [J_0, \tau_e, \tau_s].$$

It is worth noting that the magnitude of the magnetic field $B_y$ is
not formally included in the vector $\theta$, and is fixed at the given
value of 1 T in accordance with (22). This is due to our observation
that when $B_y$ was allowed to vary within prescribed bounds (the
so-called “box constraints”) $0 \le B_y \le B^*$, at optimality the
decision variable $B_y$ always assumed the maximum possible value,
$B_y = B^*$. Hence, for simplicity the value of $B_y$ was fixed as in
(22). The rest of the decision variables were box-constrained as
follows:

$$|J_0| \le 10^8\ \text{A/m}^2, \qquad 10^{-5}\ \text{s} \le \tau_s,\ \tau_e \le 10^9\ \text{s}, \qquad (24)$$

where the prescribed range of allowable current density values was
chosen so as to eliminate Joule heating considerations (more precisely,
to ensure that the thermal effects associated with application of
electric current are negligible, see Barakati and Zhupanska (2012b) for
an in-depth discussion of this issue). The box constraints on the fall
and rise times $\tau_e$ and $\tau_s$ are selected in order to ensure
realistic current profiles (as in the case of the lower bound) as well
as to avoid numerical difficulties with convergence of the optimization
procedures described above (as in the case of the upper bound).

During the optimization procedure, the initial values for both the
first- and second-stage solution vectors $\theta$ and
$\theta'(\omega)$, $\omega \in \Omega$, were chosen as follows:
$J_0 = 1.0 \times 10^6$ A/m$^2$, $\tau_s = 4.8$ ms, $\tau_e = 4.8$ ms.

The optimal solution of the two-stage stochastic PDE-constrained
optimization problem (12) (or (20)) obtained by the solution process
described above is presented in Table 2, which contains the parameters
$(J_0, \tau_s, \tau_e)$ of the waveform (23) of electric current as the
components of the first-stage solution vector $\theta$ and second-stage
vectors $\theta'(\omega_i)$, $i = 1,2,3$. The corresponding optimal
waveform profiles of the electric current (23) are shown in Figure 2.
Again, we emphasize the structure of the obtained two-stage stochastic
solution: during the time interval $[0, T_0]$, i.e., from $t = 0$ until
$t = 10$ ms, the optimal first-stage electric current
($J_0 = 1.81 \times 10^6$ A/m$^2$, $\tau_s = 10.7$ ms,
$\tau_e = 36.2$ ms) is applied in order to minimize the expected plate
deflection due to an uncertain impact load. According to the
assumptions of our model, the parameters of the actual realization of
the randomized impact load (i.e., the actual observed scenario) become
known by time $t = T_0 = 10$ ms, and, depending on the observed
scenario, the parameters of the electric current are “switched” at
$t = T_0$ to the corresponding second-stage solution values so as to
minimize post-impact vibrations of the plate. For example, if it is
determined that the impact was “light”, i.e., an impact load
corresponding to scenario $\omega_1$ was observed during $[0, T_0]$,
then at $t = T_0$ the parameters of the electric current are changed to
$J_0 = 10^8$ A/m$^2$, $\tau_s = 4.9$ ms, $\tau_e = 2.5$ ms.

Table 2 Optimal parameters of the electric current (23) obtained after
solving the two-stage stochastic PDE-constrained optimization problem
(12).

                                           Second stage, scenario
Parameter                   First stage    $\omega_1$    $\omega_2$    $\omega_3$
$J_0$, $10^6 \times$ A/m$^2$    1.81           100.0         0.93          100.0
$\tau_s$, ms                10.7           4.9           7.4           2.2
$\tau_e$, ms                36.2           2.5           0.01          4.1

The resulting vibrations of the plate during the time interval
$[0, T_1]$ (i.e., from 0 to 50 ms) are displayed for each scenario,
along with the corresponding current profile, in Figure 3. Note that in
all three subfigures of Figure 3, the profile of the electric current
between $t = 0$ and $t = 10$ ms is the same and represents the
first-stage solution (due to the differences in the maximum impact
pressure across the scenarios, the subfigures use different scales on
the vertical axes). It is also of interest to note that in scenarios
$\omega_1$ and $\omega_2$ the electromagnetic load applied during the
first stage is such that it causes the plate to deflect in the
direction opposite to the direction of impact. This observation is also
in accord with the formulated model: the first stage solution minimizes
the plate deflection “on average”; in addition, the magnitude of the
maximum impact pressure in scenario $\omega_3$ is two to almost three
times higher than those in scenarios $\omega_1$ and $\omega_2$.

Figure 4 presents, for each of the three scenarios, the comparisons of
the plate’s transverse deflection with and without application of the
(optimal) electromagnetic field. It is clear that the constructed
two-stage stochastic optimization solution allows for significant
suppression of vibrations caused by uncertain impact load in all three
scenarios. It can be seen from Figure 4 that, while the developed
two-stage model and the corresponding optimal parameters of the
electromagnetic field result in substantial damping of post-impact
vibrations, the vibrations are not suppressed completely. This is a
natural consequence of the fact that the impact load is uncertain, and
therefore it is impossible to provide the “best” response to each of
the possible scenarios.

Next we illustrate the effectiveness of the developed framework in the
situation when the impact load is known beforehand, i.e., when it can
be regarded as deterministic. One can expect that in this case the
parameters of the electromagnetic field can be tuned to achieve a much
better mitigation of post-impact effects compared to the stochastic
case.

Fig. 3 Transverse deflection of the plate and optimal electric current
waveforms corresponding to different impact load scenarios.

Fig. 4 Transverse deflection in the center of the plate vs. time for
different scenarios.

In particular, we assume that the deterministic impact load has the
same parameters as the load of scenario $\omega_2$, in accordance with
expression (21) and Table 1. It is then convenient to consider that the
stochastic problem is solved under the assumption that all scenarios
except $\omega_2$ are impossible, i.e.,

$$P(\omega_1) = 0, \qquad P(\omega_2) = 1, \qquad P(\omega_3) = 0.$$

This implies that in the scenario-based formulation (20) of the
two-stage stochastic PDE-constrained optimization problem (12) the
terms in the objective function that correspond to scenarios
$\omega_1$ and $\omega_3$ are eliminated, and, in addition, the
constraints that enforce satisfaction of the PDE equations and boundary
conditions in scenarios $\omega_1$, $\omega_3$ are also disregarded.

Table 3 Optimal parameters of the electric current in the deterministic
case when impact load has the same parameters as in scenario
$\omega_2$ of the stochastic case.

Parameter                   First stage    Second stage, $\omega_2$
$J_0$, $10^6 \times$ A/m$^2$    1.46412        0.924654
$\tau_s$, ms                10.0972        7.18217
$\tau_e$, ms                124.662        0.01

Fig. 5 Transverse deflection of the plate in the deterministic case
that is based on scenario $\omega_2$.

Note, however, that the two-stage structure of the solution of (20) is
still preserved, which means that at $t = T_0$ the parameters of the
applied electric current are allowed to change. In other words,
electric currents of two different waveforms determined by $\theta$
and $\theta'(\omega_2)$ are applied during time intervals $[0, T_0]$
and $[T_0, T_1]$, respectively. During the time interval $[0, T_0]$,
electric current with parameters given by the first stage solution
$\theta$ is used to optimally mitigate the impact itself, while during
$[T_0, T_1]$ the electric current with parameters $\theta'(\omega_2)$
then suppresses the post-impact effects.

With the exception of the modifications just described, the rest of the
parameters of the problem are the same as before. The obtained solution
of this deterministic problem is given in Table 3. Figure 5 shows the
transverse middle plane deflection, $w_c$, at the center of the plate,
$y = 0$, as a function of time for the cases when only the mechanical
load is present, and when the optimal electromagnetic field is applied
as well. It is easy to see that in a deterministic setting the proposed
framework is capable of practically eliminating the vibrations.

6  Conclusions 

In this work, a two-stage stochastic PDE-constrained optimization
methodology is developed for the active vibration control of structures
in the presence of uncertainties in mechanical loads. The solution
methodology includes a black-box first-order optimization procedure
embedded in the two-stage stochastic optimization formulation. The
black-box first-order optimization procedure consists of solving a
system of governing PDEs and automatic differentiation with hyper-dual
numbers for computing the objective function and its gradient,
respectively; and applying a first-order active-set algorithm with a
conjugate gradient method for solving the optimization problem.

The developed optimization methodology is applied to the problem of
post-impact vibration control (via an applied electromagnetic field) of
an electrically conductive carbon fiber reinforced composite plate
subjected to an uncertain, or stochastic, impact load. The system of
governing PDEs describing this problem consists of nonlinear equations
of motion and Maxwell’s equations. The randomized impact load applied
to the plate consists of three equiprobable scenarios with different
parameters of maximum impact pressure and impact duration.
Simultaneously, according to the two-stage stochastic optimization
framework, an electromagnetic load in the configuration prescribed by
the first-stage optimization solution is applied at the initial moment
of time and is changed at the end of the first stage as dictated by the
second-stage optimization solution. The electromagnetic load consists
of a time-dependent electric current applied in the fiber direction and
a constant in-plane magnetic field applied in the direction
perpendicular to the electric current. Electric current waveform
characteristics (i.e., the maximum current density, fall and rise
times) constitute the vector of optimization variables, or control
parameters. The optimal solution of the two-stage stochastic
PDE-constrained optimization problem represents a sequence of actions,
where the first-stage electric current waveform is applied at the
moment of impact without knowing the actual impact load parameters; the
second-stage electric current waveform represents a corrective action,
which is applied when the parameters of the actual impact load have
been observed/identified. The results show that the constructed
two-stage optimization solution allows for a significant suppression of
vibrations caused by the randomized impact load in all impact load
scenarios. Lastly, the effectiveness of the developed methodology is
illustrated in the case of a deterministic impact load, where the
two-stage strategy enables one to practically eliminate post-impact
vibrations.

Abstract

This paper discusses the use of polyhedral approximations in solving p-order cone programming (pOCP)
problems, or linear problems with p-order cone constraints, and their mixed-integer extensions. In particular, it is
shown that the cutting-plane technique proposed in Krokhmal and Soberanis (2010) for a special type of polyhedral
approximations of pOCP problems, which allows for generation of cuts in a constant time not dependent on
the accuracy of approximation, is applicable to a larger family of polyhedral approximations. We also show that
it can further be extended to form an exact solution method for pOCP problems with $O(\varepsilon^{-1})$ iteration
complexity. Moreover, it is demonstrated that an analogous constant-time cut generating algorithm exists for recursively
constructed lifted polyhedral approximations of second-order cones due to Ben-Tal and Nemirovski (2001). It is
also shown that the developed polyhedral approximations and the corresponding cutting plane solution methods
can be efficiently used for obtaining exact solutions of mixed-integer pOCP problems.

Keywords: p-order cone programming, second-order cone programming, polyhedral approximation, cutting
plane methods, mixed-integer p-order cone programming, stochastic programming, portfolio optimization.


1  Introduction 


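In this paper we consider solving linear programming problems with p-order cone constraints

\begin{align}
\min\quad & c^{\top} x \tag{1a}\\
\text{s.t.}\quad & Ax \le b, \tag{1b}\\
& \|C^{(k)} x + e^{(k)}\|_{p_k} \le h^{(k)\top} x + f^{(k)}, \quad k = 1,\dots,K, \tag{1c}\\
& x \in \mathbb{R}^n, \notag
\end{align}

where $\|\cdot\|_p$ denotes the p-norm in $\mathbb{R}^N$:

\[
\|a\|_p = \begin{cases} \left(|a_1|^p + \dots + |a_N|^p\right)^{1/p}, & p \in [1,\infty),\\ \max\{|a_1|,\dots,|a_N|\}, & p = \infty. \end{cases} \tag{2}
\]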


We call formulation (1) a p-order cone programming problem (pOCP) by analogy with second-order cone programming
(SOCP), which constitutes a special case of (1) when $p_k = 2$ for all $k = 1,\dots,K$.
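
As a minimal illustration of definition (2) and of a single constraint of type (1c), the following sketch evaluates the p-norm and checks the cone inequality once the left-hand vector and the right-hand scalar have been computed at a given x. The function names are ours and are introduced only for illustration.

```python
import math

def p_norm(a, p):
    """p-norm (2) of a sequence a; p = math.inf gives the max-norm."""
    if p == math.inf:
        return max(abs(v) for v in a)
    return sum(abs(v) ** p for v in a) ** (1.0 / p)

def satisfies_cone_constraint(lhs_vector, rhs_value, p, tol=1e-12):
    """Check ||lhs_vector||_p <= rhs_value, i.e. one constraint of type (1c)
    after C x + e and h^T x + f have been evaluated at a given x."""
    return p_norm(lhs_vector, p) <= rhs_value + tol
```

For p = 2 the check reduces to the familiar second-order cone inequality.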

Our motivation for considering problems of the form (1) stems from risk-averse optimization under uncertainty
and stochastic programming, where use of certain classes of risk measures leads to problems with p-order cone
constraints; see Section 4.1 for details.

The available literature on solving problem (1) with “general” values of $p \in (1, \infty)$, i.e., not restricted to well-studied
special cases of $p_k = 1$, $2$, or $\infty$, is relatively limited. Interior-point approaches to p-order cone programming
have been considered by Xue and Ye [25] with respect to minimization of sum of p-norms; a self-concordant
barrier for p-cone has also been introduced by Nesterov [19]. Glineur and Terlaky [12] proposed an interior-point
algorithm along with the corresponding barrier functions for a related problem of $l_p$-norm optimization (see also
[21]). A polyhedral approximation approach to pOCP problems was considered by Krokhmal and Soberanis [15].
In the case when p is a rational number, the existing primal-dual methods of second-order cone programming can
be employed for solving p-order cone optimization problems using a reduction of p-order cone constraints to a
system of linear and second-order cone constraints proposed by Nesterov and Nemirovski [20] and Ben-Tal and
Nemirovski [8]; see also Morenko et al. [18].

This paper represents a continuation of the work of Krokhmal and Soberanis [15] on polyhedral approximation
approaches to solving pOCP problems. The contribution of this work to the literature consists of the following:
it is shown that the cutting plane method developed in [15] for solving a special type of polyhedral approximations
of pOCP problems, which allows for generation of cuts in a constant time not dependent on the accuracy of
approximation, is applicable to a larger family of polyhedral approximations. Further, it is demonstrated that this
constant-time cut generation procedure can be modified so as to constitute an exact solution method with $O(\varepsilon^{-1})$
iteration complexity. Next, we present a constant-time cut generation scheme for lifted polyhedral approximations
of SOCP problems due to Ben-Tal and Nemirovski [9]. The noteworthy aspect of this result is that Ben-Tal and
Nemirovski’s lifted polyhedral approximation is constructed recursively, with the length of recursion controlling
the accuracy of approximation, yet the cuts can be generated in a constant time that does not depend on the
accuracy/recursion length. Finally, we illustrate that the polyhedral approximation approach and the corresponding
cutting plane solution methods can be efficiently employed for obtaining exact solutions of mixed-integer extensions
of pOCP problems (see below).

The  paper  is  organized  as  follows:  in  Section  2  we  discuss  the  general  properties  of  polyhedral  approximations 
of  p-cones.  Section  3.1  summarizes  the  general  cutting  plane  method  for  polyhedral  approximations  of  pOCP 
problems.  In  Sections  3.2  and  3.3  we  explore  fast  constant-time  cut  generating  techniques  for  gradient-based  and 
lifted  polyhedral  approximations  of  pOCP  and  SOCP  problems,  respectively.  The  developed  solution  techniques 
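are then illustrated on pOCP and SOCP problems of type (1), and are also employed for solving mixed-integer
p-order cone programming (MIpOCP) problems

\begin{align}
\min\quad & c^{\top} x + d^{\top} z \tag{3a}\\
\text{s.t.}\quad & Ax + Bz \le b, \tag{3b}\\
& \|C^{(k)} x + D^{(k)} z + e^{(k)}\|_{p_k} \le h^{(k)\top} x + g^{(k)\top} z + f^{(k)}, \quad k = 1,\dots,K, \tag{3c}\\
& x \in \mathbb{R}^n, \quad z \in \mathbb{Z}^m, \tag{3d}
\end{align}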


which  arise  in  the  context  of  portfolio  optimization  with  certain  risk  measures.  The  corresponding  discussion  is 
presented  in  Section  4. 


2  Polyhedral approximations of p-order cones


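In contrast to the Euclidean (p = 2) norm, which admits a representation via scalar product, $\|a\|_2 = (a^{\top}a)^{1/2}$,
the general $p \ne 2$ norm $\|\cdot\|_p$ explicitly requires the absolute value operator $|\cdot|$ in (2). Thus, in what follows it
suffices to consider p-cones in the positive orthant of $\mathbb{R}^{N+1}$,

\[
\mathcal{K}_p^{(N+1)} = \left\{ \xi \in \mathbb{R}^{N+1}_+ \;\middle|\; \xi_0 \ge \|(\xi_1,\dots,\xi_N)\|_p \right\}, \tag{4}
\]

since in the context of problems (1) and (3) the absolute values of p-norm operands can be expressed using linear
constraints. Then, by a polyhedral approximation of $\mathcal{K}_p^{(N+1)}$ we understand a polyhedral cone in $\mathbb{R}^{N+1+\kappa_m}_+$, where
$\kappa_m \ge 0$ may be generally non-zero,

\[
\mathcal{H}_{p,m}^{(N+1)} = \left\{ \begin{pmatrix} \xi \\ u \end{pmatrix} \in \mathbb{R}^{N+1+\kappa_m}_+ \;\middle|\; H_{p,m}^{(N+1)} \begin{pmatrix} \xi \\ u \end{pmatrix} \ge 0 \right\}, \tag{5}
\]

having the properties that:

(H1) any $(\xi_0,\dots,\xi_N)^{\top} \in \mathcal{K}_p^{(N+1)}$ can be extended to some $(\xi_0,\dots,\xi_N,u_1,\dots,u_{\kappa_m})^{\top} \in \mathcal{H}_{p,m}^{(N+1)}$;

(H2) for some prescribed $\varepsilon = \varepsilon(m) > 0$, any $(\xi_0,\dots,\xi_N,u_1,\dots,u_{\kappa_m})^{\top} \in \mathcal{H}_{p,m}^{(N+1)}$ satisfies $\|(\xi_1,\dots,\xi_N)\|_p \le (1+\varepsilon)\,\xi_0$.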


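Here m is the parameter of the construction that controls the approximation accuracy $\varepsilon$. Replacing each of the
p-order cone constraints in problem (1) by their polyhedral approximations of the form (5), we obtain a linear
programming approximation of the pOCP problem (1):

\[
\min\left\{ c^{\top}x \;\middle|\; Ax \le b,\ \ H_{p_k,m}^{(N_k+1)} \begin{pmatrix} h^{(k)\top}x + f^{(k)} \\ C^{(k)}x + e^{(k)} \\ u^{(k)} \end{pmatrix} \ge 0,\ \ u^{(k)} \ge 0,\ \ k = 1,\dots,K \right\}. \tag{6}
\]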


Observe that the projection of the feasible region of (6) on the space of variables x lies in between the feasible set
of pOCP (1) and that of its “$\varepsilon$-relaxation”,

\[
\min\left\{ c^{\top}x \;\middle|\; Ax \le b,\ \ \|C^{(k)}x + e^{(k)}\|_{p_k} \le (1+\varepsilon)\left(h^{(k)\top}x + f^{(k)}\right),\ \ k = 1,\dots,K \right\}. \tag{7}
\]


Thus, problem (6) represents an $\varepsilon$-approximation of pOCP (1), given that the feasible regions of problems (1) and
(7) are “close”. Conditions under which the feasible sets of (1) and (7) are indeed $O(\varepsilon)$-close have been given by
Ben-Tal and Nemirovski [9, Proposition 4.1] for the case of p = 2, and their argumentation carries over to the case
of $p \ne 2$ practically without modifications. Specifically, if we denote by (pOCP) and (pOCP$_\varepsilon$) the initial problem
(1) and its $\varepsilon$-relaxation (7), respectively, the following holds.


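Proposition 1 (Ben-Tal and Nemirovski [9]) Assume that (pOCP) is: (i) strictly feasible, i.e., there exist $\bar{x}$ and
$r > 0$ such that

\[
A\bar{x} \le b, \quad \|C^{(k)}\bar{x} + e^{(k)}\|_{p_k} \le h^{(k)\top}\bar{x} + f^{(k)} - r, \quad k = 1,\dots,K, \tag{8a}
\]

and (ii) “semibounded”, i.e., there exists $R > 0$ such that

\[
Ax \le b, \quad \|C^{(k)}x + e^{(k)}\|_{p_k} \le h^{(k)\top}x + f^{(k)}, \ k = 1,\dots,K \quad \Longrightarrow \quad h^{(k)\top}x + f^{(k)} \le R, \ k = 1,\dots,K. \tag{8b}
\]

Then for every $\varepsilon > 0$ such that $\gamma(\varepsilon) = R\varepsilon/r < 1$, one has

\[
\gamma(\varepsilon)\,\bar{x} + (1 - \gamma(\varepsilon))\,\mathrm{Feas}(\mathrm{pOCP}_\varepsilon) \subseteq \mathrm{Feas}(\mathrm{pOCP}) \subseteq \mathrm{Feas}(\mathrm{pOCP}_\varepsilon), \tag{8c}
\]

where Feas(P) denotes the feasible set of a problem (P).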

Remark 1 As noted in [9], the second inclusion in (8c) holds trivially, whereas the first inclusion rules out (under
the stated conditions) the situations in which, for example, the pOCP problem is infeasible but every one of its
$\varepsilon$-relaxations is feasible.


In constructing polyhedral approximations (5) of p-order cones we follow the “lift-and-approximate” approach
of Ben-Tal and Nemirovski [9], who developed efficient, in terms of dimensionality, polyhedral approximations
for quadratic cones. The first step in the construction procedure consists in a lifted representation, dubbed by the
authors “tower of variables”, of a p-cone in $\mathbb{R}^{N+1}_+$ as a nested sequence of $N - 1$ three-dimensional p-cones.
The original construction relied on the assumption that $N = 2^d$ for some integer $d \ge 1$, which was by no means
restrictive, but allowed for a simple structure of the lifted set, which could be visualized as a symmetric binary tree
of three-dimensional cone inequalities that are partitioned into $d = \log_2 N$ “levels”, with $2^{d-l}$ inequalities at
level $l$. Below we present a slightly different notation/representation of the “tower-of-variables” lifting technique
that does not explicitly use the binary tree structure, and which simplifies its practical implementation in the case
of general $N \ne 2^d$. Namely, given the $(N+1)$-dimensional p-cone, consider the set defined by intersection of
$N - 1$ three-dimensional p-cones in $\mathbb{R}^{N+1}_+ \times \mathbb{R}^{N-1}_+$:

\[
\xi_0 = \xi_{2N-1}, \quad \xi_{N+j} \ge \|(\xi_{2j-1}, \xi_{2j})\|_p, \quad j = 1,\dots,N-1. \tag{9}
\]

Proposition 2 Projection of set (9) onto the space of variables $(\xi_0,\dots,\xi_N)$ coincides with the set (4). In other
words, any $\xi \in \mathbb{R}^{N+1}_+$ that satisfies (4) can be extended to $\xi \in \mathbb{R}^{N+1}_+ \times \mathbb{R}^{N-1}_+$ that satisfies (9), and any $\xi \in \mathbb{R}^{2N}_+$
satisfying (9) is such that its first $N + 1$ components satisfy (4).

Proof:  Follows  immediately  by  expanding  the  recursion  in  (9).  □ 
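
The recursion in (9) can be traced numerically. The sketch below (function names are ours, for illustration only) fills the lifted variables $\xi_{N+1},\dots,\xi_{2N-1}$ with their smallest admissible values and returns the resulting $\xi_0$, which should coincide with $\|(\xi_1,\dots,\xi_N)\|_p$ for any N, not only for powers of two.

```python
def p_norm(a, p):
    """p-norm for finite p >= 1."""
    return sum(abs(v) ** p for v in a) ** (1.0 / p)

def tower_of_variables(xi, p):
    """Given xi_1..xi_N >= 0, return [xi_0, xi_1, ..., xi_{2N-1}] where every
    three-dimensional cone inequality in (9) holds with equality."""
    n = len(xi)
    lifted = [0.0] * (2 * n)          # index 0 holds xi_0, indices 1..2N-1 as in (9)
    lifted[1:n + 1] = xi
    for j in range(1, n):             # j = 1, ..., N-1; operands have smaller indices
        lifted[n + j] = p_norm((lifted[2 * j - 1], lifted[2 * j]), p)
    lifted[0] = lifted[2 * n - 1]     # xi_0 = xi_{2N-1}
    return lifted
```

Since $2j < N + j$ for $j \le N - 1$, a single pass in increasing j suffices; no explicit tree structure is needed.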

Remark 2 The chain inequalities (9) can similarly be organized into a binary tree, where the variable on the left-hand
side of p-cone inequality represents a parent node, and the two variables on the right-hand side are its child
nodes. Such a binary tree, however, will have a rather non-symmetric structure. If, for example, N = 5, then
$\xi_9 = \xi_0$ is the root, or level $3 = \lceil \log_2 5 \rceil$ node, $\xi_7, \xi_8$ are level-2 nodes, $\xi_3,\dots,\xi_6$ are level-1 nodes, and $\xi_1, \xi_2$ are
level-0 nodes. If, on the other hand, $N = 2^d$, then the binary tree becomes symmetric and coincides with that in
[9], where level 0 contains the nodes $\xi_1,\dots,\xi_N$.

The second step of the procedure is to construct a polyhedral approximation

\[
\mathcal{H}_{p,m}^{(3)} = \left\{ \begin{pmatrix} \xi \\ u \end{pmatrix} \in \mathbb{R}^{3+\kappa_m}_+ \;\middle|\; H_{p,m}^{(3)} \begin{pmatrix} \xi \\ u \end{pmatrix} \ge 0 \right\} \tag{10}
\]

for each of the three-dimensional p-cones in (9). Observe that if approximation (10) of each of the three-dimensional
p-cones (9) contains $O(\nu)$ facets, $\nu = \nu(m)$, the total number of facets in the approximation of
the original $(N+1)$-dimensional p-cone is $O(\nu N)$, i.e., it is linear in the dimensionality N of the original p-cone.

Proposition 3 Consider cone (4) and its lifted representation (9). If each of the three-dimensional cones in (9)
is approximated by (10) with an accuracy $\epsilon > 0$, the resulting approximation accuracy $\varepsilon$ of the original cone (4)
satisfies

\[
\varepsilon \le (1+\epsilon)^{\lceil \log_2 N \rceil} - 1.
\]

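Proof: The vector $\xi \in \mathbb{R}^{2N}_+$ must satisfy $\xi_0 = \xi_{2N-1}$, $(1+\epsilon)\,\xi_{N+j} \ge \|(\xi_{2j-1}, \xi_{2j})\|_p$, $j = 1,\dots,N-1$.
Expanding the recursion, we obtain

\begin{align*}
\xi_0^p = \xi_{2N-1}^p &\ge \frac{\xi_{2N-3}^p}{(1+\epsilon)^p} + \frac{\xi_{2N-2}^p}{(1+\epsilon)^p}
\ge \frac{\xi_{2N-7}^p}{(1+\epsilon)^{2p}} + \frac{\xi_{2N-6}^p}{(1+\epsilon)^{2p}} + \frac{\xi_{2N-5}^p}{(1+\epsilon)^{2p}} + \frac{\xi_{2N-4}^p}{(1+\epsilon)^{2p}} \\
&\ge \dots \ge \frac{\xi_1^p}{(1+\epsilon)^{p k_1}} + \dots + \frac{\xi_N^p}{(1+\epsilon)^{p k_N}},
\end{align*}

where $k_i$ is the number of “levels” in the “tower of variables” on the way from $\xi_{2N-1}$ to $\xi_i$. It is straightforward to
check that $k_i \in \{\lceil \log_2 N \rceil - 1, \lceil \log_2 N \rceil\}$, whence $(1+\epsilon)^{\lceil \log_2 N \rceil}\,\xi_0 \ge \|(\xi_1,\dots,\xi_N)\|_p$. □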
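
The bound of Proposition 3 is easily inverted to find how accurately the three-dimensional cones have to be approximated for a prescribed overall accuracy. The following sketch assumes $N \ge 2$; the function names are hypothetical and serve only as an illustration.

```python
import math

def overall_accuracy_bound(eps3d, n):
    """Upper bound of Proposition 3 on the accuracy for the (N+1)-dimensional cone."""
    levels = math.ceil(math.log2(n))
    return (1.0 + eps3d) ** levels - 1.0

def required_3d_accuracy(eps, n):
    """Accuracy of the 3D approximations for which the bound above equals eps (N >= 2)."""
    levels = math.ceil(math.log2(n))
    return (1.0 + eps) ** (1.0 / levels) - 1.0
```

For instance, with N = 1000 there are $\lceil \log_2 1000 \rceil = 10$ levels, so an overall accuracy of $10^{-3}$ calls for a three-dimensional accuracy slightly below $10^{-4}$.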

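When p = 1 or p = ∞, the cone $\mathcal{K}_p^{(3)}$ is already polyhedral; in the case of p = 2, the problem of constructing a
polyhedral approximation of the second-order cone $\mathcal{K}_2^{(3)}$ was also addressed by Ben-Tal and Nemirovski [9], who
proposed the following lifted polyhedral approximation of $\mathcal{K}_2^{(3)}$:

\begin{align}
& u_0 \ge \xi_1, \tag{11a}\\
& v_0 \ge \xi_2, \tag{11b}\\
& u_i = \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \sin\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1}, \quad i = 1,\dots,m, \tag{11c}\\
& v_i \ge \left| -\sin\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1} \right|, \quad i = 1,\dots,m, \tag{11d}\\
& u_m \le \xi_0, \quad v_m \le \tan\!\left(\tfrac{\pi}{2^{m+1}}\right) u_m, \tag{11e}\\
& 0 \le u_i, v_i, \quad i = 0,\dots,m. \tag{11f}
\end{align}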

Remarkably, the accuracy of the polyhedral approximation (11) is exponentially small in m: $\varepsilon(m) = O(4^{-m})$. The
construction is based on an elegant geometric argument that utilizes a well-known elementary fact that rotation of
a vector in $\mathbb{R}^2$ is an affine transformation that preserves the Euclidean norm (2-norm) and that the parameters of
this affine transform depend only on the angle of rotation. An approach to constructing a framework of polyhedral
relations that generalizes inductive constructions of extended formulations via projections, such as the polyhedral
approximation (11), has been introduced by Kaibel and Pashkovich [13].
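
The rotation argument can be followed numerically. The sketch below is our illustration, not part of the original construction: it takes the first four relations of the Ben-Tal and Nemirovski approximation (11) with equalities, so that every step rotates the pair (u, v) by the current angle and reflects it into the positive orthant. The returned $u_m$ is then a feasible value for $\xi_0$, and elementary trigonometry suggests that it lies within the factor $1/\cos(\pi/2^{m+1})$ of the Euclidean norm, in agreement with the $O(4^{-m})$ accuracy.

```python
import math

def btn_extension(xi1, xi2, m):
    """Follow (11a)-(11d) with equalities and return the pair (u_m, v_m)."""
    u, v = abs(xi1), abs(xi2)                       # (11a), (11b)
    for i in range(1, m + 1):
        angle = math.pi / 2 ** (i + 1)
        c, s = math.cos(angle), math.sin(angle)
        u, v = c * u + s * v, abs(-s * u + c * v)   # (11c), (11d)
    return u, v
```

Each step preserves the Euclidean norm of (u, v) and halves the bound on its polar angle, which is why m steps give an angle of at most $\pi/2^{m+1}$.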

Unfortunately, the lifted polyhedral approximation (11) of the second-order cone $\mathcal{K}_2^{(3)}$ does not seem to be
extendable to general p-order cones $\mathcal{K}_p^{(3)}$ with $p \in (1,2) \cup (2,\infty)$. Therefore, we employ a “gradient”
approximation of $\mathcal{K}_p^{(3)}$ using circumscribed planes. Given the parameter of construction $m \in \mathbb{N}$, let us call function
$\varphi_m : [0, m] \mapsto [0, \pi/2]$ an approximation function if it is continuous and strictly increasing on $[0, m]$, and,
moreover, satisfies

\[
\Delta\varphi_m = \max_{i = 0,\dots,m-1} \{\varphi_m(i+1) - \varphi_m(i)\} \to 0, \quad m \to \infty.
\]

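Then, for the following parametrization of the p-cone surface in $\mathbb{R}^3_+$,

\[
\xi_1 = \xi_0 \frac{\cos\theta}{(\cos^p\theta + \sin^p\theta)^{1/p}}, \quad
\xi_2 = \xi_0 \frac{\sin\theta}{(\cos^p\theta + \sin^p\theta)^{1/p}}, \quad
\xi_0 \ge 0, \ \theta \in [0, \tfrac{\pi}{2}], \tag{12}
\]

where $\theta$ is the polar angle, any given approximation function $\varphi_m$ generates a gradient approximation of $\mathcal{K}_p^{(3)}$:

\[
\mathcal{H}_{p,m}^{(3)}(\varphi_m) = \left\{ \xi \in \mathbb{R}^3_+ \;\middle|\; \xi_0 \ge \alpha_{p,i}^{(\varphi_m)} \xi_1 + \beta_{p,i}^{(\varphi_m)} \xi_2, \ i = 0,\dots,m \right\}, \tag{13a}
\]

where

\[
\begin{pmatrix} \alpha_{p,i}^{(\varphi_m)} \\ \beta_{p,i}^{(\varphi_m)} \end{pmatrix}
= \left(\cos^p\varphi_m(i) + \sin^p\varphi_m(i)\right)^{\frac{1}{p}-1}
\begin{pmatrix} \cos^{p-1}\varphi_m(i) \\ \sin^{p-1}\varphi_m(i) \end{pmatrix}, \quad i = 0,\dots,m. \tag{13b}
\]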


The values $\varphi_m(i)$ in (13) represent the polar angles at which the planes $\xi_0 = \alpha_{p,i}\xi_1 + \beta_{p,i}\xi_2$ are tangent to the
p-cone $\mathcal{K}_p^{(3)}$. In such a way, the properties of the polyhedral approximation (13) of the p-cone $\mathcal{K}_p^{(3)}$ are determined
by the values of $\varphi_m$ at integer values $\{0,\dots,m\}$ of its argument; nevertheless, the computability properties of
$\varphi_m(t)$ for arbitrary values $t \in [0, m]$ are also of major importance, as will be shown in the next section. The
following proposition establishes the quality of the gradient polyhedral approximation (13), and is a generalization
of a similar result established for a special choice of $\varphi_m$ in [15].
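
To make (13) concrete, the sketch below computes the facet coefficients and the smallest $\xi_0$ admitted by (13a) for given $\xi_1, \xi_2 \ge 0$. It assumes, purely for illustration, the uniform approximation function $\varphi_m(t) = \pi t/(2m)$ and $p > 1$; the function names are ours.

```python
import math

def uniform_phi(i, m):
    """Uniform approximation function phi_m(t) = pi * t / (2 m) (an assumed choice)."""
    return math.pi * i / (2 * m)

def gradient_coefficients(p, m, phi=uniform_phi):
    """Coefficients (alpha_{p,i}, beta_{p,i}), i = 0..m, of (13b)."""
    coeffs = []
    for i in range(m + 1):
        t = phi(i, m)
        # clamp tiny negative values produced by rounding near 0 and pi/2
        c, s = max(math.cos(t), 0.0), max(math.sin(t), 0.0)
        scale = (c ** p + s ** p) ** (1.0 / p - 1.0)
        coeffs.append((scale * c ** (p - 1), scale * s ** (p - 1)))
    return coeffs

def outer_bound(xi1, xi2, coeffs):
    """Smallest xi_0 allowed by (13a): max_i (alpha_i * xi1 + beta_i * xi2)."""
    return max(a * xi1 + b * xi2 for a, b in coeffs)
```

By Hölder's inequality each plane underestimates the p-norm, with equality at its own tangent angle, so `outer_bound` never exceeds $\|(\xi_1,\xi_2)\|_p$.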


Proposition 4 For large enough values of $m \in \mathbb{N}$, the polyhedral set $\mathcal{H}_{p,m}^{(3)}(\varphi_m)$ defined by the gradient
approximation (13) with approximation function $\varphi_m$ satisfies properties (H1)-(H2). Specifically, if the approximation
function is such that for some $r > 0$

\[
\Delta\varphi_m = O(m^{-r}), \quad m \gg 1,
\]

then for any $\xi \in \mathcal{K}_p^{(3)}$ one has $\xi \in \mathcal{H}_{p,m}^{(3)}$, and any $\xi \in \mathcal{H}_{p,m}^{(3)}$ satisfies $\|(\xi_1,\xi_2)\|_p \le (1+\varepsilon(m))\,\xi_0$, where the
approximation accuracy $\varepsilon(m)$ is polynomially small in m:

\[
\varepsilon(m) = O\!\left(m^{-r\min\{p,2\}}\right), \quad m \gg 1.
\]

Remark 3 One possible choice of $\varphi_m$ is $\varphi_m(t) = \frac{\pi}{2m}t$, which yields a “uniform” gradient approximation of the
p-cone, i.e., a gradient approximation (13) where the circumscribed planes are spaced “uniformly” with respect
to the polar angle $\theta$, and are tangent to the p-cone at the polar angles $\theta_i = \frac{\pi i}{2m}$, $i = 0,\dots,m$. If p = 2, the uniform
approximation can be seen as “optimal”, since it has the same accuracy at each sector $\left[\frac{\pi i}{2m}, \frac{\pi(i+1)}{2m}\right]$, and thus
requires the smallest number of facets to achieve a given approximation accuracy. In the case of $p \ne 2$, however,
the accuracy of the uniform gradient approximation varies from sector to sector. Thus, it may be of interest to
construct an approximation function $\varphi_m$ that results in a constant accuracy at each sector $[\varphi_m(i), \varphi_m(i+1)]$ of
p-cone, thereby minimizing the number of facets needed to achieve the desired accuracy. On the other hand, if
the structure of the problem is such that an optimal solution is known to be located in a certain part of the cone,
it might be beneficial to construct an approximation that is more accurate within this particular region and less
accurate outside of it. These considerations provide an intuition on how a careful choice of $\varphi_m$ may reduce the size
of the problem in question. In this work, however, we do not discuss the question of constructing an “optimal”
approximation, instead focusing on the issues related to solving the polyhedral approximations of pOCP problems.
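
The sector-to-sector variation mentioned in Remark 3 can be observed numerically. The sketch below is an illustration under our own sampling scheme, not a procedure from the text: it estimates, for the uniform gradient approximation, the worst relative gap between the p-norm and the best circumscribed plane inside every sector.

```python
import math

def tangent_plane(p, theta):
    """Coefficients of the plane tangent to the p-cone at polar angle theta, cf. (13b)."""
    c, s = max(math.cos(theta), 0.0), max(math.sin(theta), 0.0)
    scale = (c ** p + s ** p) ** (1.0 / p - 1.0)
    return scale * c ** (p - 1), scale * s ** (p - 1)

def sector_accuracies(p, m, samples=50):
    """Sampled worst value of ||xi||_p / max_i(alpha_i xi_1 + beta_i xi_2) - 1
    in each sector [theta_i, theta_{i+1}] of the uniform approximation."""
    step = math.pi / (2 * m)
    planes = [tangent_plane(p, i * step) for i in range(m + 1)]
    result = []
    for i in range(m):
        worst = 0.0
        for k in range(samples + 1):
            theta = (i + k / samples) * step
            x1, x2 = max(math.cos(theta), 0.0), math.sin(theta)
            norm = (x1 ** p + x2 ** p) ** (1.0 / p)
            lower = max(a * x1 + b * x2 for a, b in planes)
            worst = max(worst, norm / lower - 1.0)
        result.append(worst)
    return result
```

For p = 2 all sectors should report the same value, $1/\cos(\pi/(4m)) - 1$, whereas for $p \ne 2$ the values differ noticeably between sectors near the axes and sectors near the diagonal.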

For  p  =  2  and  a  given  approximation  accuracy,  the  lifted  polyhedral  approximation  (11)  due  to  Ben-Tal  and 
Nemirovski  [9]  is  superior  to  the  gradient  polyhedral  approximation  (13)  in  terms  of  dimensionality.  However, 
computational studies [11, 15] indicated that solving polyhedral approximations, either lifted or gradient, of SOCP
problems was computationally inefficient compared to “native” SOCP solution techniques, such as self-dual
interior-point methods.

At the same time, the computational efficiency of the polyhedral approximation approach can be substantially
improved by employing decomposition methods that exploit the specific structure of polyhedral approximations in
(13), whereby the polyhedral approximation approach becomes competitive with SOCP-based solution methods
for pOCP problems with $p \ne 2$. This was demonstrated for a special case of the uniform gradient polyhedral
approximation [15]. In the next section we show that analogous computational efficiencies can be achieved for more
general gradient polyhedral approximations of pOCP problems, as well as for the lifted polyhedral approximation
of SOCP problems.


3  Cutting  plane  methods  for  polyhedral  approximations  of  SOCP  and 
pOCP  problems 

Computationally  efficient  methods  for  solving  polyhedral  approximations  (5)  of  SOCP  and  pOCP  problems  can 
be  constructed  by  taking  advantage  of  (i)  the  special  structure  of  the  problem  induced  by  the  “tower-of-variables” 
representation  of  high-dimensional  cones  as  an  intersection  of  three-dimensional  ones  in  a  lifted  space,  and  (ii)  the 
special structures of polyhedral approximations of three-dimensional quadratic or p-order cones.

With respect to (i), a cutting plane method that, given a polyhedral approximation for 3D cones, utilizes the
structure of the “tower-of-variables” reformulation in the approximating problems (5), was proposed in [15]. This
method is briefly described in Section 3.1 below, since it is necessary in the context of (ii), namely, for exploiting
the special properties of gradient and lifted polyhedral approximations of 3D cones for fast cut generation. In
particular, the discussion that follows in Sections 3.2 and 3.3 demonstrates that, despite the differences in
construction and properties, the lifted Ben-Tal-Nemirovski’s approximation (11) of quadratic cones and the gradient
approximation (13) of p-cones offer the same computational efficiency for cut generation.


3.1  A  cutting  plane  procedure  for  polyhedral  approximations  of  pOCP  problems 

The cutting plane algorithm described here is applicable to reformulations of pOCP problems obtained using the
“tower-of-variables” lifting technique (9). Assuming for simplicity that problem (1) contains only one p-cone
constraint (K = 1) of dimension N + 1, the corresponding reformulation of (1) is obtained by lifting the p-cone
constraint using the “tower-of-variables” method as

\begin{align}
\min\quad & c^{\top}x \tag{14a}\\
\text{s.t.}\quad & Ax \le b, \tag{14b}\\
& w_{N+j} \ge \|(w_{2j-1}, w_{2j})\|_p, \quad j = 1,\dots,N-1, \tag{14c}\\
& w_j \ge |(Cx + e)_j|, \quad j = 1,\dots,N, \tag{14d}\\
& w_{2N-1} = h^{\top}x + f, \tag{14e}
\end{align}

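where $w \in \mathbb{R}^{2N-1}$. Each of the three-dimensional p-order cones (14c) is subsequently replaced by its polyhedral
approximation (10), which yields the following polyhedral approximation of pOCP (1):

\begin{align}
\min\quad & c^{\top}x \tag{15a}\\
\text{s.t.}\quad & H_{p,m}^{(3)} \begin{pmatrix} w^{(j)} \\ u^{(j)} \end{pmatrix} \ge 0, \quad j = 1,\dots,N-1, \tag{15b}\\
& u^{(j)} \in \mathbb{R}^{\kappa_m}_+, \quad j = 1,\dots,N-1, \tag{15c}\\
& \text{(14b), (14d), (14e)}, \tag{15d}
\end{align}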


where the vectors $w^{(j)}$ stand for the triplets $w^{(j)} = (w_{N+j}, w_{2j-1}, w_{2j})^{\top}$. The polyhedral
approximation of the pOCP problem (1) constructed in this way possesses a special structure that can be exploited
for solving the LP problem (15) efficiently. In particular, the following cutting plane representation for (15) was
presented in [15]:

\begin{align}
\min\quad & c^{\top}x \tag{16a}\\
\text{s.t.}\quad & w_{N+j} \ge (0,\dots,0, w_{2j-1}, w_{2j})\,\pi_i, \quad i \in V_{p,m}, \ j = 1,\dots,N-1, \tag{16b}\\
& \text{(14b), (14d), (14e)}, \tag{16c}
\end{align}

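where $V_{p,m}$ is the set of vertices $\pi_i$ of the polyhedron

\[
\left\{ \pi \ge 0 \;\middle|\; \hat{H}_{p,m}^{\top}\pi \le (1, 0, \dots, 0)^{\top} \right\}, \tag{17}
\]

and the matrix $\hat{H}_{p,m}$ is obtained by augmenting the approximation matrix $H_{p,m}^{(3)}$ with two extra rows $(0, 1, 0, \dots, 0)$,
$(0, 0, 1, 0, \dots, 0)$, where 1’s correspond to the variables $w_{2j-1}$ and $w_{2j}$:

\[
\hat{H}_{p,m} = \begin{pmatrix} H_{p,m}^{(3)} \\ 0\ \ 1\ \ 0\ \cdots\ 0 \\ 0\ \ 0\ \ 1\ \cdots\ 0 \end{pmatrix}.
\]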

Constraints (16b) are then generated via an iterative procedure. Assuming that problem (16) is bounded, consider
the master problem in the form

\begin{align}
\min\quad & c^{\top}x \tag{18a}\\
\text{s.t.}\quad & w_{N+j} \ge \varsigma_{j,i}\, w_{2j-1} + \tau_{j,i}\, w_{2j}, \quad i = 1,\dots,C_j, \ j = 1,\dots,N-1, \tag{18b}\\
& \text{(14b), (14d), (14e)}, \tag{18c}
\end{align}

where $\varsigma_{j,i}$ and $\tau_{j,i}$ stand for the components $\pi_{\nu-1}$ and $\pi_{\nu}$ of the vector $\pi \in \mathbb{R}^{\nu}$, and $C_j$ is the number of constraints
generated during preceding iterations. Let $(x^*, w^*) \in \mathbb{R}^{n+2N-1}$ be an optimal solution of the master (note that if
(18) is infeasible, then (16) is infeasible too, and the procedure stops). For each $j = 1,\dots,N-1$, the following
LP problem is solved:

\[
t_j^* := \max\left\{ (0,\dots,0, w^*_{2j-1}, w^*_{2j})\,\pi \;\middle|\; \hat{H}_{p,m}^{\top}\pi \le (1, 0, \dots, 0)^{\top}, \ \pi \ge 0 \right\}, \tag{19}
\]
and it is checked whether the condition

\[
w^*_{N+j} \ge t_j^* = w^*_{2j-1}\,\pi^{*(j)}_{\nu-1} + w^*_{2j}\,\pi^{*(j)}_{\nu} \tag{20}
\]

holds, where $\pi^{*(j)}$ is an optimal solution of (19). If it does not, a new constraint (18b) is added for the variable
$w_{N+j}$ by incrementing the corresponding counter of constraints in (18b): $C_j := C_j + 1$, and setting $\varsigma_{j,i'} = \pi^{*(j)}_{\nu-1}$,
$\tau_{j,i'} = \pi^{*(j)}_{\nu}$ for $i' = C_j$. Upon checking condition (20) for all variables $w_{N+j}$, $j = 1,\dots,N-1$, in (18), the
master problem (18) is augmented with new constraints and is solved again. If (20) holds for all variables $w_{N+j}$,
and thus no new cuts are generated during an iteration, the current solution $x^*, w^*$ of the master problem is optimal
for the original LP approximation problem (16). In such a way, the described cutting plane procedure obtains an
optimal solution, if it exists, of the original LP approximation problem (16) after a finite number of iterations, with,
perhaps, some anticycling scheme employed.


3.2  Fast  cut  generation  for  gradient  approximations  of  p-order  cones 

The cutting-plane scheme of Section 3.1 exploits the properties of the “tower-of-variables” representation (9) of
high-dimensional p-cones as a nested sequence of 3D p-cones to facilitate solving (large-scale) polyhedral
approximations (5). In this section we show that if the gradient polyhedral approximation (13) is used for approximating
three-dimensional p-cones in (15), the structure of this approximation can be utilized to achieve significant
computational savings, provided that the approximation function $\varphi_m$ of the gradient polyhedral approximation satisfies
a certain computability condition.


Proposition 5 Consider a polyhedral approximation (6) of pOCP problem (1), obtained by reformulating each
of the K p-cones in (1) using the “tower-of-variables” representation (9) and then applying the gradient
polyhedral approximation (13) with parameter of construction m and approximation function $\varphi_m$. Then, if $\varphi_m^{-1}$ is
computable in $O(1)$ time, during an iteration of the cutting plane scheme of Section 3.1 new cuts can be generated
in $O\!\left(\sum_k N_k\right)$ time that is independent of m, where $N_k + 1$ is the dimension of the kth p-cone in (1).


Similarly  to  Proposition  4,  this  result  strengthens  the  statement  in  [15].  We  still  provide  its  proof  here,  since  it  is 
necessary  for  formalizing  a  subsequent  observation  in  Proposition  6. 

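Proof of Proposition 5: When the gradient polyhedral approximation (13) is used, the cut-generating problem
(19) can be formulated as

\[
\max\left\{ \sum_{i=0}^{m} \left(\alpha_{p,i}\,\xi_1^* + \beta_{p,i}\,\xi_2^*\right)\pi_i \;\middle|\; \sum_{i=0}^{m}\pi_i \le 1, \ \ \pi_0,\dots,\pi_m \ge 0, \ \ s_1, s_2 \ge 0 \right\}, \tag{21}
\]

where the constants $\xi_1^*$ and $\xi_2^*$ stand for the corresponding elements of the current optimal solution $w^*$ of the
master problem: $\xi_1^* = w^*_{2j-1}$, $\xi_2^* = w^*_{2j}$, and $s_1 = \sum_i \alpha_{p,i}\pi_i$, $s_2 = \sum_i \beta_{p,i}\pi_i$ are the last two components of the
vector $\pi$ in (19), i.e., the coefficients of the generated cut. Disregarding the trivial case of $\xi_1^* = \xi_2^* = 0$, we assume
that at least one of these parameters is positive: $\xi_1^* + \xi_2^* > 0$. It is clear that solving (21) amounts to finding a maximum
element of the set $\{\alpha_{p,i}\,\xi_1^* + \beta_{p,i}\,\xi_2^*\}_{i=0,\dots,m}$. Namely, if one has

\[
i^* \in \operatorname*{arg\,max}_{i=0,\dots,m} \left\{\alpha_{p,i}\,\xi_1^* + \beta_{p,i}\,\xi_2^*\right\}, \tag{22a}
\]

then an optimal solution $\pi^*$ of (21) is given by

\[
\pi_i^* = 0, \ i \in \{0,\dots,m\} \setminus \{i^*\}; \quad \pi_{i^*}^* = 1; \quad s_1 = \alpha_{p,i^*}, \quad s_2 = \beta_{p,i^*}. \tag{22b}
\]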
For fixed $\xi_1^*, \xi_2^* \ge 0$ and $p > 1$, consider the function

\[
g(t) = \xi_1^* \frac{\cos^{p-1}t}{(\cos^p t + \sin^p t)^{1-1/p}} + \xi_2^* \frac{\sin^{p-1}t}{(\cos^p t + \sin^p t)^{1-1/p}}, \quad t \in [0, \tfrac{\pi}{2}],
\]

with the derivative

\[
g'(t) = (p-1)\,\frac{\sin^{p-1}t\,\cos^{p-1}t}{(\cos^p t + \sin^p t)^{2-1/p}} \left( -\frac{\xi_1^*}{\cos t} + \frac{\xi_2^*}{\sin t} \right).
\]

Obviously, for $t \in [0, \tfrac{\pi}{2}]$ function $g(t)$ is either strictly monotone (when one of $\xi_1^*, \xi_2^*$ is zero) or has a unique
global maximum at $t^* = \arctan(\xi_2^*/\xi_1^*)$. Then, for a continuous and strictly increasing approximating function
$\varphi_m : [0, m] \mapsto [0, \tfrac{\pi}{2}]$, the function $g(\varphi_m(t))$ is also either monotone on $[0, m]$ or has a unique maximum at
$\varphi_m^{-1}(\arctan(\xi_2^*/\xi_1^*))$. Consequently, if the inverse $\varphi_m^{-1}$ of the approximating function is computable in $O(1)$ time,
the index $i^*$ of a maximum element of the sequence

\[
g(\varphi_m(i)) = \xi_1^*\,\alpha_{p,i} + \xi_2^*\,\beta_{p,i}, \quad i = 0,\dots,m,
\]

which defines an optimal solution (22) of cut-generating problem (21), can be determined in $O(1)$ time as

\[
i^* \in \operatorname*{arg\,max}_{i} \left\{ g(\varphi_m(i)) \;\middle|\; i \in \left\{0, \ \lfloor \varphi_m^{-1}(t^*) \rfloor, \ \lfloor \varphi_m^{-1}(t^*) \rfloor + 1, \ m \right\} \right\}, \quad \text{where } t^* = \arctan(\xi_2^*/\xi_1^*). \tag{23}
\]

Given that each p-cone constraint of order $p_k$ and dimensionality $N_k + 1$ requires $N_k - 1$ such operations,
generation of new cuts in problem (18) that employs a gradient polyhedral approximation requires $O\!\left(\sum_k N_k\right)$
time. □
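
The index selection in (23) can be sketched as follows. For illustration we assume the uniform approximation function $\varphi_m(t) = \pi t/(2m)$, whose inverse $2mt/\pi$ is available in closed form, and compare the result against a plain scan over all $m + 1$ facets; the function names are ours.

```python
import math

def cut_value(i, xi1, xi2, p, m):
    """g(phi_m(i)) = alpha_{p,i} * xi1 + beta_{p,i} * xi2 for the uniform phi_m."""
    theta = math.pi * i / (2 * m)
    # clamp tiny negative values produced by rounding near 0 and pi/2
    c, s = max(math.cos(theta), 0.0), max(math.sin(theta), 0.0)
    scale = (c ** p + s ** p) ** (1.0 / p - 1.0)
    return scale * (c ** (p - 1) * xi1 + s ** (p - 1) * xi2)

def fast_cut_index(xi1, xi2, p, m):
    """Index i* from (23); the number of evaluations does not grow with m."""
    t_star = math.atan2(xi2, xi1)                   # arctan(xi2 / xi1), safe for xi1 = 0
    k = min(int(2 * m * t_star / math.pi), m - 1)   # floor of the inverse of phi_m
    return max((0, k, k + 1, m), key=lambda i: cut_value(i, xi1, xi2, p, m))

def brute_force_index(xi1, xi2, p, m):
    """Reference O(m) scan over all facets."""
    return max(range(m + 1), key=lambda i: cut_value(i, xi1, xi2, p, m))
```

Only four candidate facets are evaluated regardless of m, which is the source of the constant-time claim; the scan is kept solely as a correctness check.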


Remark 4 An example of the approximation function $\varphi_m$ whose inverse $\varphi_m^{-1}(t)$ is not computable in a constant
time for any given $t \in [0, \tfrac{\pi}{2}]$ can be furnished as $\varphi_m(z) = (\theta_{i+1} - \theta_i)(z - i) + \theta_i$ for $i \le z \le i + 1$, $i = 0,\dots,m-1$,
where $0 \le \theta_0 < \theta_1 < \dots < \theta_m \le \tfrac{\pi}{2}$. In other words, it is a piecewise linear function corresponding to
some arbitrarily prescribed polar angles $\theta_i$, $i = 0,\dots,m$, that determine locations of the facets of the polyhedral
approximation. It is easy to see that evaluation of $\varphi_m^{-1}(t')$ for any given $t'$ requires determining k such that
$t' \in [\theta_k, \theta_{k+1}]$, which cannot be generally done in a constant time that is independent of m.
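
The search step in Remark 4 can be made explicit. In the sketch below (our illustration, with hypothetical names) the forward map is a direct table lookup, while the inverse has to locate the sector containing the argument, here by bisection in $O(\log m)$ operations.

```python
import bisect

def make_piecewise_phi(thetas):
    """Piecewise linear phi_m of Remark 4 for angles theta_0 < ... < theta_m (m >= 1)."""
    m = len(thetas) - 1

    def phi(z):
        i = min(int(z), m - 1)                 # sector index, found in O(1)
        return (thetas[i + 1] - thetas[i]) * (z - i) + thetas[i]

    def phi_inverse(t):
        # finding k with theta_k <= t <= theta_{k+1} needs a search over the table
        k = min(max(bisect.bisect_right(thetas, t) - 1, 0), m - 1)
        return k + (t - thetas[k]) / (thetas[k + 1] - thetas[k])

    return phi, phi_inverse
```

The cost of the inverse thus grows with m, in contrast to approximation functions with a closed-form inverse.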


In the case when $\xi_1^*, \xi_2^* > 0$, the index $i^*$ of the cut that may have to be added to the master is given by $\lfloor \varphi_m^{-1}(t^*) \rfloor$
or $\lfloor \varphi_m^{-1}(t^*) \rfloor + 1$. Note that as m increases (and the quality of approximation becomes finer), for any fixed
$\xi_1^*, \xi_2^* > 0$ the facets corresponding to $\lfloor \varphi_m^{-1}(t^*) \rfloor$, $\lfloor \varphi_m^{-1}(t^*) \rfloor + 1$ converge to a plane tangent to the cone at the
point determined by the polar angle $\theta^* = \arctan(\xi_2^*/\xi_1^*)$, so that the corresponding cut takes the form

\[
w_{N+j} \ge w_{2j-1} \frac{\cos^{p-1}\theta^*}{(\cos^p\theta^* + \sin^p\theta^*)^{1-1/p}} + w_{2j} \frac{\sin^{p-1}\theta^*}{(\cos^p\theta^* + \sin^p\theta^*)^{1-1/p}}, \quad \theta^* = \arctan\frac{w^*_{2j}}{w^*_{2j-1}}. \tag{24}
\]

In this case, one does not need to solve the cut-generating LP (19) and check condition (20) in order to add the
corresponding cut. Namely, for a current solution $w^*$ of the master, cut (24) is added to the master if the condition

\[
\|(w^*_{2j-1}, w^*_{2j})\|_p \le (1+\epsilon)\, w^*_{N+j} \tag{25}
\]

is not satisfied for the respective $j = 1,\dots,N-1$. The following proposition formalizes this procedure.


Proposition 6 Given an instance of pOCP problem (1) that satisfies the conditions of Proposition 1, consider a
cutting plane scheme for constructing an approximate solution of its lifted reformulation (14), where the master
problem has the form (18), and for a given solution $x^*, w^*$ of the master, cuts of the form (24) are added if condition
(25) is not satisfied for a specific j. Assuming that (18) is bounded, this cutting plane procedure terminates after a
finite number of iterations for any given $\varepsilon > 0$, with, perhaps, some anti-cycling scheme applied. In particular, the
algorithm is guaranteed to generate at most $O(\varepsilon^{-1})$ cutting planes, and in the special case of p = 2 the described
cutting plane algorithm is guaranteed to stop after at most $O(\varepsilon^{-0.5})$ iterations.
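
One separation pass of this scheme may be sketched as follows. This is an illustration under our own conventions rather than the authors' implementation: the master solution is stored in a list `w` with `w[0]` unused so that indices match (14), all operands are assumed nonnegative, `eps3d` denotes the accuracy used in (25), and the master LP itself is not solved here.

```python
import math

def tangent_cut(w1, w2, p):
    """Coefficients (a, b) of cut (24), w_{N+j} >= a * w_{2j-1} + b * w_{2j},
    tangent to the p-cone at the polar angle of the point (w1, w2)."""
    theta = math.atan2(w2, w1)
    c, s = max(math.cos(theta), 0.0), max(math.sin(theta), 0.0)
    scale = (c ** p + s ** p) ** (1.0 / p - 1.0)
    return scale * c ** (p - 1), scale * s ** (p - 1)

def separate(w, n, p, eps3d):
    """For every j = 1..N-1 with condition (25) violated, return (j, a, b)."""
    cuts = []
    for j in range(1, n):
        w1, w2, parent = w[2 * j - 1], w[2 * j], w[n + j]
        norm = (w1 ** p + w2 ** p) ** (1.0 / p)
        if norm > (1.0 + eps3d) * parent:      # condition (25) fails
            a, b = tangent_cut(w1, w2, p)
            cuts.append((j, a, b))
    return cuts
```

The generated plane is tight at the current point, $a\,w^*_{2j-1} + b\,w^*_{2j} = \|(w^*_{2j-1}, w^*_{2j})\|_p$, so a cut returned by `separate` is always violated by the current master solution.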


Proof: Given $\varepsilon > 0$, let $\epsilon$ be the corresponding approximation accuracy of 3D $p$-cones in (14) due to Proposition 3:

$$\epsilon = (1+\varepsilon)^{1/\lceil \log_2 N \rceil} - 1 = \lceil \log_2 N \rceil^{-1}\varepsilon + O(\varepsilon^2), \tag{26}$$

and let $w^*_{N+j}$, $w^*_{2j-1}$, and $w^*_{2j}$ be the elements of the current solution of the master. We will show that there exists some $\delta_\epsilon$ such that if $\theta^*$ is located at an angular distance closer than $\delta_\epsilon$ from an existing cut, then (24) implies (25), i.e., no new cut can be added within $\delta_\epsilon$ from an existing one. By (24), for any existing cut at polar angle $\theta_k$ the solution of the master should satisfy


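$$w^*_{N+j} \ge w^*_{2j-1}\,\frac{\cos^{p-1}\theta_k}{(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}} + w^*_{2j}\,\frac{\sin^{p-1}\theta_k}{(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}}$$

$$= \|(w^*_{2j-1}, w^*_{2j})\|_p \left( \frac{\cos\theta^*\,\cos^{p-1}\theta_k}{(\cos^p\theta^* + \sin^p\theta^*)^{\frac{1}{p}}\,(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}} + \frac{\sin\theta^*\,\sin^{p-1}\theta_k}{(\cos^p\theta^* + \sin^p\theta^*)^{\frac{1}{p}}\,(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}} \right),$$

where $\theta^* = \arctan\dfrac{w^*_{2j}}{w^*_{2j-1}}$. Let $\theta^* = \theta_k + \delta$, in which case

$$w^*_{N+j} \ge \|(w^*_{2j-1}, w^*_{2j})\|_p\, \frac{\cos\delta\,(\cos^p\theta_k + \sin^p\theta_k) + \sin\delta\,(\sin^{p-1}\theta_k\cos\theta_k - \cos^{p-1}\theta_k\sin\theta_k)}{(\cos^p(\theta_k+\delta) + \sin^p(\theta_k+\delta))^{\frac{1}{p}}\,(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}}$$

$$= \|(w^*_{2j-1}, w^*_{2j})\|_p \left( A(\theta_k,\delta)\cos\delta + B(\theta_k,\delta)\sin\delta \right), \tag{27}$$

where we denote

$$A(\theta_k,\delta) = \frac{(\cos^p\theta_k + \sin^p\theta_k)^{\frac{1}{p}}}{(\cos^p(\theta_k+\delta) + \sin^p(\theta_k+\delta))^{\frac{1}{p}}}, \qquad B(\theta_k,\delta) = \frac{\sin^{p-1}\theta_k\cos\theta_k - \cos^{p-1}\theta_k\sin\theta_k}{(\cos^p(\theta_k+\delta) + \sin^p(\theta_k+\delta))^{\frac{1}{p}}\,(\cos^p\theta_k + \sin^p\theta_k)^{1-\frac{1}{p}}}.$$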


As $|\delta|$ approaches zero, the right-hand side in (27) converges uniformly to $\|(w^*_{2j-1}, w^*_{2j})\|_p$. Namely, let $K_0 = \min_\theta \|(\cos\theta, \sin\theta)\|_p = \text{const} > 0$; then for $|\delta| \le \pi/2$

$$|A(\theta_k,\delta)\cos\delta + B(\theta_k,\delta)\sin\delta - 1| \le |B(\theta_k,\delta)|\sin|\delta| + A(\theta_k,\delta)(1-\cos\delta) + |A(\theta_k,\delta) - 1| \le K_1|\delta|,$$

where $K_1 > 0$ is a constant that depends only on $p$ and $K_0$. Here $|B(\theta_k,\delta)|$ and $A(\theta_k,\delta)$ are bounded from above in terms of $K_0$, the term $|A(\theta_k,\delta) - 1|$ is bounded using Lagrange's mean value theorem for the function $f(t) = \|(\sin t, \cos t)\|_p$, and the well-known facts that $\sin|t| \le |t|$ and $1 - \cos t \le t^2$ were utilized.

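Then, for any $\epsilon > 0$ there exists $\delta_\epsilon = \dfrac{\epsilon}{K_1(1+\epsilon)}$ such that for any $\theta_k$ and any $|\delta| \le \delta_\epsilon$ condition (24) implies (25) by

$$w^*_{N+j} \ge (1 - K_1|\delta|)\,\|(w^*_{2j-1}, w^*_{2j})\|_p \ge \frac{1}{1+\epsilon}\,\|(w^*_{2j-1}, w^*_{2j})\|_p.$$

Hence, no two cuts can be located closer than at an angular distance of $\delta_\epsilon$, whereby no more than $\left\lceil \frac{\pi}{2\delta_\epsilon} \right\rceil + 1 = O(\epsilon^{-1})$ cuts can be generated. A stronger result holds for $p = 2$. Indeed, observe that in this case (27) can be rewritten as

$$w^*_{N+j} \ge w^*_{2j-1}\cos\theta_k + w^*_{2j}\sin\theta_k = \|(w^*_{2j-1}, w^*_{2j})\|_2\,(\cos\theta^*\cos\theta_k + \sin\theta^*\sin\theta_k) = \|(w^*_{2j-1}, w^*_{2j})\|_2\cos\delta. \tag{28}$$

Again, in order for (28) to imply (25), one has to require that $\cos\delta \ge \frac{1}{1+\epsilon}$, or $\cos\delta_\epsilon = \frac{1}{1+\epsilon}$, which implies $\delta_\epsilon = O(\epsilon^{0.5})$. The statement of the proposition then follows immediately from (26). □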
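To make the $p = 2$ bound concrete, the following sketch evaluates the counting argument of the proof numerically. It assumes that cut angles lie in $[0, \pi/2]$ and that no two cuts are closer than $\delta_\epsilon = \arccos\frac{1}{1+\epsilon}$; the function name `max_cuts_p2` is hypothetical.

```python
import math

def max_cuts_p2(eps):
    """Upper bound on the number of cuts per 3D second-order cone:
    angles in [0, pi/2], pairwise separated by at least delta_eps."""
    delta_eps = math.acos(1.0 / (1.0 + eps))
    return math.ceil(math.pi / (2.0 * delta_eps)) + 1

for eps in (1e-2, 1e-4, 1e-6):
    print(eps, max_cuts_p2(eps))
```

Reducing the accuracy parameter by a factor of 100 increases the bound by roughly a factor of 10, in line with the $O(\epsilon^{-0.5})$ estimate.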
Remark 5 The cutting plane procedure outlined in Proposition 6 represents an exact solution algorithm for the lifted pOCP problem (14), and, correspondingly, the original pOCP problem (1), in the sense that it does not rely on any particular form of polyhedral approximation: once an approximate solution $x_{\varepsilon_1}$ is obtained with a given accuracy $\varepsilon = \varepsilon_1$, an (improved) solution $x_{\varepsilon_2}$ can subsequently be constructed by setting new accuracy $\varepsilon = \varepsilon_2 < \varepsilon_1$ and resuming the cutting plane algorithm (i.e., the algorithm does not have to be restarted). In contrast, the cutting plane method of Section 3.1 in this case would require updating the algorithm itself, namely changing the LP problem (19) that is used to generate new cuts. The $O(\varepsilon^{-1})$ iteration complexity of the described method in the case of general $p \ne 2$ is comparable to the $O(\varepsilon^{-1})$ iteration complexity of first-order methods for SOCP ([6, 16], see also [5, 17]), while in the $p = 2$ case it improves to $O(\varepsilon^{-0.5})$. Of course, the computational cost per iteration increases, and in the worst case the last iterations would require solving LPs with $O(\varepsilon^{-1})$ (respectively, $O(\varepsilon^{-0.5})$) constraints. In practice, however, the described method terminates within a relatively small number of iterations (see Section 4.2).


3.3  Fast  cut  generation  for  lifted  polyhedral  approximation  of  second-order  cones 

In this section we demonstrate that a result analogous to Proposition 5 can be formulated in the case of the lifted approximation (11) due to Ben-Tal and Nemirovski [9], i.e., such an approximation also allows for efficient application of the cut-generation technique.

In  accordance  with  the  cutting  plane  method  of  Section  3.1,  consider  the  master  problem  (18)  that  corresponds  to 
a  polyhedral  approximation  of  the  SOCP  (p  =  2)  version  of  problem  (14),  where  Ben-Tal  and  Nemirovski’s  lifted 
polyhedral  approximation  (11)  of  three-dimensional  quadratic  cones  in  the  “tower-of-variables”  is  used.  In  this 
case,  the  coefficients  gjj ,  tjj  in  (18b)  are  found  as  the  simplex  multipliers  of  the  first  two  constraints  of  the  LP 
problem 


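$$\begin{aligned}
\min\ \ & z && \text{(29a)}\\
\text{s.t.}\ \ & u_0 \ge w^*_{2j-1}, && \text{(29b)}\\
& v_0 \ge w^*_{2j}, && \text{(29c)}\\
& u_i = \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \sin\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1}, \quad i = 1, \ldots, m, && \text{(29d)}\\
& v_i \ge \left| -\sin\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1} \right|, \quad i = 1, \ldots, m, && \text{(29e)}\\
& u_m \le z, && \text{(29f)}\\
& v_m \le \tan\!\left(\tfrac{\pi}{2^{m+1}}\right) u_m, && \text{(29g)}\\
& u, v, z \ge 0,
\end{aligned}$$

where $w^*_{2j-1}$, $w^*_{2j}$ are the components of the optimal solution of the master problem obtained during the most recent iteration. If the optimal value of (29) satisfies $w^*_{N+j} < z^*$, then a new cut of the form (18b) is added to the master.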

It is important to note that, unlike the gradient polyhedral approximation (13) of $p$-cones, the lifted approximation (11) of quadratic cones due to Ben-Tal and Nemirovski is constructed recursively, where the parameter $m$ represents the recursion counter and controls approximation accuracy. Intuitively, the process of constructing this lifted approximation of a 3D quadratic cone can be visualized as a sequence of "rotations" and "reflections" in $\mathbb{R}^2$. Given a vector $(u_0, v_0)$ in the positive quadrant of the plane, during the first iteration of the recursion it is rotated clockwise by $\pi/4$ around the origin and, if the rotation puts it into the lower half-plane, it is reflected symmetrically about the horizontal axis, resulting in vector $(u_1, v_1)$ that is again in the positive quadrant. During the second iteration, vector $(u_1, v_1)$ is rotated clockwise by $\pi/8$ and reflected symmetrically about the horizontal axis if it falls into the lower half-plane due to the rotation. The resulting vector is designated $(u_2, v_2)$, and so on.

In view of this, as the first step of constructing an $O(1)$ solution algorithm for the dual of (29), we formally show that an optimal solution of (29) can be obtained in $O(m)$ time by applying the above recursion procedure to vector $(w^*_{2j-1}, w^*_{2j})$.

To this end, let us denote by $(r_i, \alpha_i)$ the polar coordinates of the pair $(u_i, v_i)$ in (29):

$$r_i = r_i(u_i, v_i) = \|(u_i, v_i)\|_2, \qquad \alpha_i = \alpha_i(u_i, v_i) = \arg(u_i, v_i) = \arctan(v_i/u_i).$$

In what follows, we will use notations $(u_i, v_i)$ and $(r_i, \alpha_i)$ interchangeably. Since one can always put $z = u_m$ in (29), the discussion of feasibility and optimality in (29) reduces to that for the pair of vectors $(u, v) = (u_0, \ldots, u_m; v_0, \ldots, v_m)$. First, let us make two observations.


Observation 1 If $(u, v)$ is feasible for (29), then $\alpha_i \le \dfrac{\pi}{2^{i+1}}$ for $i = 0, \ldots, m$.

Proof: Indeed, if for some $i_0$ one has $\alpha_{i_0} > \dfrac{\pi}{2^{i_0+1}}$, then by (29d)-(29e) $\alpha_{i_0+1} > \dfrac{\pi}{2^{i_0+2}}$, which, by continuation, yields a contradiction with (29g) that requires $\alpha_m \le \dfrac{\pi}{2^{m+1}}$. □


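Observation 2 Given a feasible $(u, v)$ and $i_0 \in \{1, \ldots, m\}$, a feasible $(\tilde u, \tilde v)$ can be constructed that satisfies $(\tilde u_i, \tilde v_i) = (u_i, v_i)$ for $i \le i_0 - 1$ and $(\tilde r_i, \tilde\alpha_i) = \left(\tilde r_{i-1}, \left|\tilde\alpha_{i-1} - \frac{\pi}{2^{i+1}}\right|\right)$ for $i \ge i_0$.

Proof: For this, we only need to verify that (29g) is satisfied for $(\tilde u, \tilde v)$. Due to Observation 1, one has $\tilde\alpha_{i_0-1} = \alpha_{i_0-1} \le \dfrac{\pi}{2^{i_0}}$. Thus, by construction $\tilde\alpha_{i_0} \le \dfrac{\pi}{2^{i_0+1}}$, $\tilde\alpha_{i_0+1} \le \dfrac{\pi}{2^{i_0+2}}$, ..., $\tilde\alpha_m \le \dfrac{\pi}{2^{m+1}}$, which is equivalent to (29g). □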

With  this  in  mind  we  can  construct  an  optimal  solution  to  the  problem  under  consideration. 

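Lemma 1 An optimal solution for the problem (29) can be obtained by setting constraints (29b)-(29f) to equalities, or, in other words, $r^*_0 = \|(w^*_{2j-1}, w^*_{2j})\|_2$, $\alpha^*_0 = \arg(u_0, v_0)$, and $r^*_i = r^*_{i-1}$, $\alpha^*_i = \left|\alpha^*_{i-1} - \frac{\pi}{2^{i+1}}\right|$ for $i = 1, \ldots, m$.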


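Proof: For a feasible $(u, v)$, let $k$ be the largest of those $i \in \{1, \ldots, m\}$ for which (29e) is a strict inequality, i.e., $k$ is such that constraint (29e) is non-binding for $i = k$ and binding for $i = k+1, \ldots, m$. Following Observation 2 with $i_0 = k$, define a feasible $(\tilde u, \tilde v)$ which satisfies

$$\begin{aligned}
(\tilde u_i, \tilde v_i) &= (u_i, v_i), && i = 0, \ldots, k-1,\\
(\tilde r_k, \tilde\alpha_k) &= \left(r_{k-1}, \left|\alpha_{k-1} - \tfrac{\pi}{2^{k+1}}\right|\right),\\
(\tilde r_i, \tilde\alpha_i) &= \left(\tilde r_{i-1}, \left|\tilde\alpha_{i-1} - \tfrac{\pi}{2^{i+1}}\right|\right), && i = k+1, \ldots, m.
\end{aligned} \tag{30}$$

From the definition of $k$ and (30) it follows that $\alpha_k = \tilde\alpha_k + \Delta$, where $\Delta > 0$ due to (29e). By construction, one has

$$r_k = r_{k-1}\,\frac{\cos\tilde\alpha_k}{\cos(\tilde\alpha_k + \Delta)} \ge \tilde r_k. \tag{31}$$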


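Now let us demonstrate that $(\tilde u, \tilde v)$ yields at least as good objective value as $(u, v)$, or in other words, $\tilde u_m \le u_m$. Note that the definition of $k$ and (30) immediately imply that

$$u_m = r_m\cos\alpha_m = r_k\cos\alpha_m, \qquad \tilde u_m = \tilde r_m\cos\tilde\alpha_m = \tilde r_k\cos\tilde\alpha_m, \tag{32}$$

and

$$\alpha_m = \left|\frac{\pi}{2^{m+1}} - \left|\frac{\pi}{2^{m}} - \ldots - \left|\frac{\pi}{2^{k+2}} - \alpha_k\right| \ldots \right|\right|, \qquad \tilde\alpha_m = \left|\frac{\pi}{2^{m+1}} - \left|\frac{\pi}{2^{m}} - \ldots - \left|\frac{\pi}{2^{k+2}} - \tilde\alpha_k\right| \ldots \right|\right|. \tag{33}$$

Let us consider three cases:

(a) Assume that $\tilde\alpha_k + \Delta \le \dfrac{\pi}{2^{m+1}}$. In this case equalities (33) yield the following expressions for $\alpha_m$ and $\tilde\alpha_m$:

$$\tilde\alpha_m = \frac{\pi}{2^{m+1}} - \tilde\alpha_k, \qquad \alpha_m = \frac{\pi}{2^{m+1}} - (\tilde\alpha_k + \Delta) \le \tilde\alpha_m,$$

which upon substitution in (32) provide that $u_m \ge \tilde u_m$.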


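(b) Now consider the case of $\tilde\alpha_k \ge \tilde\alpha_m$. Successive application of the inequality $||a| - |b|| \le |a - b|$ to the expressions in (33) yields that $|\alpha_m - \tilde\alpha_m| \le \Delta$, and consequently $\alpha_m \le \tilde\alpha_m + \Delta$. Thus, from (32) one has $u_m = r_k\cos\alpha_m \ge r_k\cos(\tilde\alpha_m + \Delta)$. Upon substituting expression (31) for $r_k$ into the last inequality, we obtain

$$u_m \ge r_{k-1}\,\frac{\cos\tilde\alpha_k}{\cos(\tilde\alpha_k + \Delta)}\,\cos(\tilde\alpha_m + \Delta) =: f(\Delta).$$

Noting that $f'(\Delta) = r_{k-1}\cos(\tilde\alpha_k)\,\dfrac{\sin(\tilde\alpha_k - \tilde\alpha_m)}{\cos^2(\tilde\alpha_k + \Delta)} \ge 0$ for $\tilde\alpha_k \ge \tilde\alpha_m$ and $f(0) = \tilde u_m$, we can conclude that $u_m \ge f(\Delta) \ge f(0) = \tilde u_m$.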


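(c) Finally, suppose that both conditions of (a) and (b) are not satisfied, i.e., $\tilde\alpha_k < \tilde\alpha_m$ and $\tilde\alpha_k + \Delta > \dfrac{\pi}{2^{m+1}}$. Consider the ratio of $u_m$ and $\tilde u_m$ as given by (32), where expressions (31) and (30) are used for $r_k$ and $\tilde r_k$, respectively:

$$\frac{u_m}{\tilde u_m} = \frac{\cos\tilde\alpha_k\,\cos\alpha_m}{\cos\tilde\alpha_m\,\cos(\tilde\alpha_k + \Delta)}.$$

The above assumption and Observation 1 imply that $\tilde\alpha_k < \tilde\alpha_m$ and $\alpha_m \le \dfrac{\pi}{2^{m+1}} < \tilde\alpha_k + \Delta$, whence the last equality readily yields $u_m/\tilde u_m \ge 1$.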


In  (a)-(c)  we  have  shown  that  for  feasible  (u,  v)  such  that  constraint  (29e)  is  binding  for  i  =  k  +  1, . . . ,  m,  we  can 
construct  a  feasible  solution  with  at  least  as  good  objective  and  constraint  (29e)  binding  for  i  =  k, . . . ,  m.  Using 
this  claim  inductively,  we  can  conclude  that  for  any  feasible  (u,  v)  one  can  construct  a  feasible  solution  for  which 
all  constraints  in  (29e)  are  satisfied  as  equalities  and  which  has  objective  at  least  as  good  as  (u,  v). 


Finally, note that a similar argument can be constructed if (29b) or (29c) are not active. Indeed, the case when $v_0 > w^*_{2j}$ is completely analogous to the case when (29e) is not active. Similarly, if $u_0 > w^*_{2j-1}$, which essentially increases the value of $r_0$ and reduces the value of $\alpha_0$ by some $\delta$, let us denote as $r'_0$, $\alpha'_m$ and $u'_m$ the new values of $r_0$, $\alpha_m$, and $u_m$ corresponding to this case. Then we can observe that $u'_m = r'_0\cos\alpha'_m \ge r_0\,\dfrac{\sin\alpha_0}{\sin(\alpha_0 - \delta)}\,\cos(\alpha_m - \delta)$. Hence,

$$\frac{u_m}{u'_m} \le \frac{\cos\alpha_m\,\sin(\alpha_0 - \delta)}{\cos(\alpha_m - \delta)\,\sin\alpha_0} = \frac{\cos\delta - \cot\alpha_0\,\sin\delta}{\cos\delta + \tan\alpha_m\,\sin\delta} \le 1.$$

Thus, we can observe that the solution constructed by setting constraints (29b)-(29f) to equalities yields at least as good objective value as any other feasible solution. □


By virtue of Lemma 1, the problem of finding an optimal solution of (29) is reduced to the following: given $\alpha_0 \in \left[0, \frac{\pi}{2}\right]$ and $m \ge 1$, determine $\alpha_m$ from the recurrent relations

$$\alpha_i = \left|\alpha_{i-1} - \frac{\pi}{2^{i+1}}\right|, \qquad i = 1, \ldots, m. \tag{34}$$

Clearly, this can be done in $O(m)$ time. Below we show that determining $\alpha_m$ from recursion (34) requires $O(1)$ time.

For now, let us assume that $\alpha_0 \ne \dfrac{k\pi}{2^{m+1}}$. For $k = 1, \ldots, 2^m$ define the set $A^{(m)}_k = \left(\dfrac{(k-1)\pi}{2^{m+1}}, \dfrac{k\pi}{2^{m+1}}\right)$. Note that by Observation 1, $\alpha_m \in A^{(m)}_1$ for any $\alpha_0$.
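Lemma 2 If $\alpha_0 \in A^{(m)}_k$ and $\alpha_m$ is given by (34), then

$$\alpha_m = \begin{cases} \alpha_0 - \dfrac{(k-1)\pi}{2^{m+1}}, & \text{if } k \text{ is even},\\[2mm] \dfrac{k\pi}{2^{m+1}} - \alpha_0, & \text{if } k \text{ is odd}. \end{cases}$$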


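Proof: First, note that, by construction, the recursive relation (34) corresponds to the process of rotations and reflections, i.e., if we treat $\alpha_i$ as a polar angle, then $\alpha_{i+1}$ is obtained by rotating $\alpha_i$ clockwise by $\frac{\pi}{2^{i+2}}$ and then, if the result is in the lower half-plane, reflecting with respect to the horizontal axis. In accordance with (34), a reflection is performed whenever $\alpha_{i-1} - \frac{\pi}{2^{i+1}} < 0$, therefore for a given $\alpha_0$ we can define the number of reflections

$$\xi^{(m)}(\alpha_0) = \left|\left\{ i : \alpha_{i-1} - \frac{\pi}{2^{i+1}} < 0 \right\}\right|.$$

Next, note that if $\alpha_0, \beta_0 \in A^{(m)}_k$, then $\xi^{(m)}(\alpha_0) = \xi^{(m)}(\beta_0)$ and, moreover, for any $i$ there exists $k_i$ such that $\alpha_i, \beta_i \in A^{(m)}_{k_i}$. Indeed, by the definition of set $A^{(m)}_k$ we have that $\operatorname{sign}\left(\alpha_0 - \frac{\pi}{4}\right) = \operatorname{sign}\left(\beta_0 - \frac{\pi}{4}\right)$ and thus $\alpha_1, \beta_1 \in A^{(m)}_{k_1}$, where $k_1 = k - 2^{m-1}$ if $k \ge 2^{m-1} + 1$ (no reflection) or $k_1 = 2^{m-1} - k + 1$ if $k \le 2^{m-1}$ (one reflection). Successively repeating this argument we observe that it holds for any $i$.

Hence, we can define $\xi^{(m)}_k$ as the number of reflections due to (34) for $\alpha_0 \in A^{(m)}_k$, or $\xi^{(m)}_k = \xi^{(m)}(\alpha_0)$ for any $\alpha_0 \in A^{(m)}_k$. Let us show that if $\alpha_0 \in A^{(m)}_k$, then

$$\alpha_m = \begin{cases} \alpha_0 - \dfrac{(k-1)\pi}{2^{m+1}}, & \text{if } \xi^{(m)}_k \text{ is even},\\[2mm] \dfrac{k\pi}{2^{m+1}} - \alpha_0, & \text{if } \xi^{(m)}_k \text{ is odd}. \end{cases}$$

Using the identity $|a| = a\operatorname{sign} a$, the recursive representation (34) can be written as

$$\alpha_m = \delta_m\left(\ldots\delta_2\left(\delta_1\left(\alpha_0 - \frac{\pi}{4}\right) - \frac{\pi}{8}\right) - \ldots - \frac{\pi}{2^{m+1}}\right) = \alpha_0\prod_{i=1}^{m}\delta_i - S, \tag{36}$$

where $\delta_i = \operatorname{sign}\left(\alpha_{i-1} - \frac{\pi}{2^{i+1}}\right)$ and $S = \sum_{j=1}^{m}\frac{\pi}{2^{j+1}}\prod_{i=j}^{m}\delta_i$.

According to the arguments given above, $\prod_{i=1}^{m}\delta_i$ and $S$ should be the same for all $\alpha_0 \in A^{(m)}_k$. Also note that $\prod_{i=1}^{m}\delta_i = \pm 1$, and for all $\alpha_0$ we should have $\alpha_m \in \left[0, \frac{\pi}{2^{m+1}}\right]$. Suppose that $\prod_{i=1}^{m}\delta_i = 1$, i.e., $\alpha_m = \alpha_0 - S$, which is a linear translation of the interval $\left[\frac{(k-1)\pi}{2^{m+1}}, \frac{k\pi}{2^{m+1}}\right]$. Since the result of the translation should be contained in $\left[0, \frac{\pi}{2^{m+1}}\right]$, we have that $S = \frac{(k-1)\pi}{2^{m+1}}$. Similarly, one can conclude that

$$S = \begin{cases} \dfrac{(k-1)\pi}{2^{m+1}}, & \text{if } \prod_{i=1}^{m}\delta_i = 1,\\[2mm] -\dfrac{k\pi}{2^{m+1}}, & \text{if } \prod_{i=1}^{m}\delta_i = -1. \end{cases} \tag{37}$$

Now, let us show that

$$\left|\xi^{(m)}_j - \xi^{(m)}_{j-1}\right| = 1, \tag{38}$$

or, in other words, parity of $\xi^{(m)}_j$ alternates with $j$. In order to see this, consider the following inductive argument.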
i. Observe that $\xi^{(1)}_1 = 1$ and $\xi^{(1)}_2 = 0$, i.e., the claim holds for $m = 1$. Indeed, the claim immediately follows from the fact that $\alpha_1 = \left|\alpha_0 - \frac{\pi}{4}\right|$ for $m = 1$.

ii. Let $m \ge 2$ and $k \le 2^{m-1}$, then

$$\xi^{(m)}_k = \xi^{(m)}_{2^m - k + 1} + 1. \tag{39}$$

Indeed, $\alpha_0 \in A^{(m)}_k$ with $k \le 2^{m-1}$ implies that $\alpha_0 \le \frac{k\pi}{2^{m+1}} \le \frac{\pi}{4}$, and hence $\alpha_1 = \frac{\pi}{4} - \alpha_0$, or, equivalently, $\alpha_1 \in A^{(m)}_{2^{m-1}-k+1}$ with one reflection performed. Similarly, for $\alpha_0 \in A^{(m)}_{2^m-k+1}$ with $k \le 2^{m-1}$ we have that $\alpha_0 \ge \frac{(2^m - k)\pi}{2^{m+1}} \ge \frac{\pi}{4}$, whence $\alpha_1 = \alpha_0 - \frac{\pi}{4}$, i.e., $\alpha_1 \in A^{(m)}_{2^{m-1}-k+1}$, requiring no reflections. Note that both cases $\alpha_0 \in A^{(m)}_{2^m-k+1}$ and $\alpha_0 \in A^{(m)}_k$ result in $\alpha_1 \in A^{(m)}_{2^{m-1}-k+1}$, with the latter requiring one reflection, which means that $\xi^{(m)}_k = \xi^{(m)}_{2^m-k+1} + 1$.

iii. Let $m \ge 2$ and $k \ge 2^{m-1} + 1$, then

$$\xi^{(m)}_k = \xi^{(m-1)}_{k - 2^{m-1}}. \tag{40}$$

Similarly to the above, for $k \ge 2^{m-1} + 1$ and $\alpha_0 \in A^{(m)}_k$ it holds that $\alpha_0 \ge \frac{(k-1)\pi}{2^{m+1}} \ge \frac{\pi}{4}$, meaning that $\alpha_1 = \alpha_0 - \frac{\pi}{4} \in A^{(m)}_{k-2^{m-1}}$ with no reflections. Rewriting (34) as $2\alpha_{i+1} = \left|2\alpha_i - \frac{\pi}{2^{i+1}}\right|$, let $\beta_{i-1} = 2\alpha_i$, $i = 1, \ldots, m$, whence $\beta_i = \left|\beta_{i-1} - \frac{\pi}{2^{i+1}}\right|$, $i = 1, \ldots, m-1$. Then, observing that $\beta_0 = 2\alpha_1 \in A^{(m-1)}_{k-2^{m-1}}$, it is easy to see that for $k \ge 2^{m-1} + 1$, the problem of finding $\beta_{m-1}$ given $\beta_0 \in A^{(m-1)}_{k-2^{m-1}}$ is equivalent to the problem of determining $\alpha_m$ from $\alpha_0 \in A^{(m)}_k$ and, therefore, $\xi^{(m)}_k = \xi^{(m-1)}_{k-2^{m-1}}$.

iv. Now, assume that (38) holds for some $m \ge 1$ and let us show that it also holds for $m + 1$. To this end, consider the value of $\left|\xi^{(m+1)}_j - \xi^{(m+1)}_{j-1}\right|$: if $j > 2^m + 1$ (i.e., (iii) can be used for both $j$ and $j - 1$), then from (40) we have that $\left|\xi^{(m+1)}_j - \xi^{(m+1)}_{j-1}\right| = \left|\xi^{(m)}_{j-2^m} - \xi^{(m)}_{j-1-2^m}\right| = 1$. If $j \le 2^m$ (i.e., (ii) can be used for both $j$ and $j - 1$), then from (39) it follows that $\left|\xi^{(m+1)}_j - \xi^{(m+1)}_{j-1}\right| = \left|\xi^{(m+1)}_{2^{m+1}-j+1} - \xi^{(m+1)}_{2^{m+1}-j+2}\right|$. By substituting $j' = 2^{m+1} - j + 2$ we have that $\left|\xi^{(m+1)}_j - \xi^{(m+1)}_{j-1}\right| = \left|\xi^{(m+1)}_{j'} - \xi^{(m+1)}_{j'-1}\right|$, where $j' > 2^m + 1$, which reduces to the previous case. Otherwise, if $j = 2^m + 1$, then from (39) one has $\left|\xi^{(m+1)}_{2^m+1} - \xi^{(m+1)}_{2^m}\right| = 1$. Thus, inductively we observe that (38) holds for any $m$.

Finally, from (i) and (40) we observe that $\xi^{(m)}_{2^m} = 0$ for all $m$, thus (38) entails that $\xi^{(m)}_k$ is even iff $k$ is even. □


Lemma 3 If $\alpha_0 = \dfrac{k\pi}{2^{m+1}}$, then the recursive relations (34) yield

$$\alpha_m = \begin{cases} 0, & \text{if } k \text{ is odd},\\[1mm] \dfrac{\pi}{2^{m+1}}, & \text{if } k \text{ is even}. \end{cases} \tag{41}$$

Proof: It is straightforward to see that for $\alpha_0 = \frac{\pi}{2}$ recursion (34) yields $\alpha_m = \frac{\pi}{2^{m+1}}$. Also observe that $\alpha_m$ defined by the recursion (34) is continuous with respect to $\alpha_0$. Let $\alpha_0 = \frac{k\pi}{2^{m+1}}$, $k < 2^m$, and consider a strictly monotone sequence $\alpha^+_0(n) \downarrow \alpha_0$ with the corresponding sequence $\alpha^+_m(n)$ obtained by the recursion (34). For sufficiently large $n$ we have that $\alpha^+_0(n) \in A^{(m)}_{k+1}$. If $k$ is odd, then by Lemma 2 we have that $\alpha^+_m(n) = \alpha^+_0(n) - \frac{k\pi}{2^{m+1}} \to 0$, i.e., by continuity of $\alpha_m$ with respect to $\alpha_0$, such $\alpha_0$ yields $\alpha_m = 0$. And if $k$ is even, then $\alpha^+_m(n) = \frac{(k+1)\pi}{2^{m+1}} - \alpha^+_0(n) \to \frac{\pi}{2^{m+1}}$, i.e., $\alpha_m = \frac{\pi}{2^{m+1}}$. □

Based on Lemmas 1-3 the following corollary can be formulated.


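Corollary 1 An optimal solution of problem (29) can be constructed in a constant $O(1)$ time that does not depend on the accuracy of approximation induced by $m$. Particularly, if $\alpha_0 = \arg(w^*_{2j-1}, w^*_{2j})$ and $r_0 = \|(w^*_{2j-1}, w^*_{2j})\|_2$, then the optimal value of $u_m$ can be found as $u_m = r_0\cos\alpha_m$, where

$$\alpha_m = \begin{cases} \alpha_0 - \dfrac{(k-1)\pi}{2^{m+1}}, & \alpha_0 \in \left(\dfrac{(k-1)\pi}{2^{m+1}}, \dfrac{k\pi}{2^{m+1}}\right] \text{ and } k \text{ is even},\\[2mm] \dfrac{k\pi}{2^{m+1}} - \alpha_0, & \alpha_0 \in \left(\dfrac{(k-1)\pi}{2^{m+1}}, \dfrac{k\pi}{2^{m+1}}\right] \text{ and } k \text{ is odd},\\[2mm] \dfrac{\pi}{2^{m+1}}, & \alpha_0 = 0. \end{cases} \tag{42}$$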
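The equivalence between recursion (34) and the closed form (42) is easy to check numerically. The sketch below is an illustration only; the function names are hypothetical, and the interval index $k$ is obtained with a ceiling operation, which is an implementation choice not taken from the text.

```python
import math

def alpha_m_recursive(alpha0, m):
    """Recursion (34): m rotation/reflection steps, O(m) time."""
    a = alpha0
    for i in range(1, m + 1):
        a = abs(a - math.pi / 2 ** (i + 1))
    return a

def alpha_m_closed_form(alpha0, m):
    """Closed form (42): constant number of operations for any m."""
    h = math.pi / 2 ** (m + 1)      # width of each interval A_k
    if alpha0 == 0.0:
        return h
    k = math.ceil(alpha0 / h)       # alpha0 lies in ((k-1)h, kh]
    return alpha0 - (k - 1) * h if k % 2 == 0 else k * h - alpha0

alpha0 = 1.0
print(alpha_m_recursive(alpha0, 8), alpha_m_closed_form(alpha0, 8))
```

The optimal value of (29) is then $r_0 \cos\alpha_m$, with $r_0$ the Euclidean norm of the current point.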


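Now, let us consider the simplex multipliers of (29) that yield new cuts. By Lemma 1 we can equivalently rewrite the problem as

$$\begin{aligned}
\min\ \ & u_m && \text{(43a)}\\
\text{s.t.}\ \ & u_0 = w^*_{2j-1}, && \text{(43b)}\\
& v_0 = w^*_{2j}, && \text{(43c)}\\
& u_i = \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \sin\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1}, \quad i = 1, \ldots, m, && \text{(43d)}\\
& v_i = \delta_i\left(-\sin\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1}\right), \quad i = 1, \ldots, m, && \text{(43e)}\\
& u, v \ge 0,
\end{aligned}$$

where

$$\delta_i = \operatorname{sign}\left(-\sin\!\left(\tfrac{\pi}{2^{i+1}}\right) u_{i-1} + \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) v_{i-1}\right).$$

Note that for given $w^*_{2j-1}, w^*_{2j}$ these $\delta_i$ are constants and coincide with $\delta_i$ defined in (36). It is easy to see that, by construction, (43) has only one feasible point, which is an optimal solution for the initial problem (29). Again, we assume that $\delta_i \ne 0$.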

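Denote by $y_i$ the simplex multipliers for constraints (43b) and (43d), and by $t_i$ the simplex multipliers for constraints (43c) and (43e). Then the dual problem can be formulated as

$$\begin{aligned}
\max\ \ & w^*_{2j-1}\,y_0 + w^*_{2j}\,t_0 && \text{(44a)}\\
\text{s.t.}\ \ & y_{i-1} - \cos\!\left(\tfrac{\pi}{2^{i+1}}\right) y_i + \delta_i\sin\!\left(\tfrac{\pi}{2^{i+1}}\right) t_i \le 0, \quad i = 1, \ldots, m, && \text{(44b)}\\
& t_{i-1} - \sin\!\left(\tfrac{\pi}{2^{i+1}}\right) y_i - \delta_i\cos\!\left(\tfrac{\pi}{2^{i+1}}\right) t_i \le 0, \quad i = 1, \ldots, m, && \text{(44c)}\\
& y_m \le 1, && \text{(44d)}\\
& t_m \le 0. && \text{(44e)}
\end{aligned}$$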

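Lemma 4 An optimal solution of (44) can be found by setting all the constraints to equalities, in which case $y_m = 1$, $t_m = 0$, and

$$\begin{aligned}
y_{i-1} &= \cos\left(\frac{\pi}{2^{i+1}} + \delta_i\left(\frac{\pi}{2^{i+2}} + \ldots + \delta_{m-1}\frac{\pi}{2^{m+1}}\right)\right), && i = 1, \ldots, m,\\
t_{i-1} &= \sin\left(\frac{\pi}{2^{i+1}} + \delta_i\left(\frac{\pi}{2^{i+2}} + \ldots + \delta_{m-1}\frac{\pi}{2^{m+1}}\right)\right), && i = 1, \ldots, m.
\end{aligned} \tag{45}$$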
Proof: Indeed, let $y_m = 1$, $t_m = 0$ and let us set all the constraints to equalities. Then $y_{m-1} = \cos\frac{\pi}{2^{m+1}}$, $t_{m-1} = \sin\frac{\pi}{2^{m+1}}$. Further, from elementary trigonometry we obtain that

$$\begin{aligned}
y_{m-2} &= y_{m-1}\cos\frac{\pi}{2^{m}} - \delta_{m-1}\,t_{m-1}\sin\frac{\pi}{2^{m}} = \cos\left(\frac{\pi}{2^{m}} + \delta_{m-1}\frac{\pi}{2^{m+1}}\right),\\
t_{m-2} &= y_{m-1}\sin\frac{\pi}{2^{m}} + \delta_{m-1}\,t_{m-1}\cos\frac{\pi}{2^{m}} = \sin\left(\frac{\pi}{2^{m}} + \delta_{m-1}\frac{\pi}{2^{m+1}}\right).
\end{aligned}$$

Inductively we can see that in this case (45) holds. Finally, by comparing primal (43) and dual (44) we observe that by complementary slackness, (45) gives an optimal solution for the dual. □

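Recall that in order to construct a new cut we need the values of simplex multipliers for constraints (29b) and (29c), i.e., $y_0$ and $t_0$. By Lemma 4, one has $y_0 = \cos\gamma$ and $t_0 = \sin\gamma$, where

$$\gamma = \frac{\pi}{4} + \delta_1\left(\frac{\pi}{8} + \delta_2\left(\frac{\pi}{16} + \ldots + \delta_{m-1}\frac{\pi}{2^{m+1}}\right)\right).$$

Also note that by duality, $w^*_{2j-1}\,y_0 + w^*_{2j}\,t_0 = z^*$, hence $|\gamma - \alpha_0| = \arccos\dfrac{z^*}{\|(w^*_{2j-1}, w^*_{2j})\|_2}$. Now, by comparing this with Lemma 4 and Corollary 1 it follows that

$$\gamma = \begin{cases} \alpha_0 - \arccos\dfrac{z^*}{\|(w^*_{2j-1}, w^*_{2j})\|_2}, & \alpha_0 \in \left(\dfrac{(k-1)\pi}{2^{m+1}}, \dfrac{k\pi}{2^{m+1}}\right] \text{ and } k \text{ is even},\\[3mm] \alpha_0 + \arccos\dfrac{z^*}{\|(w^*_{2j-1}, w^*_{2j})\|_2}, & \alpha_0 \in \left(\dfrac{(k-1)\pi}{2^{m+1}}, \dfrac{k\pi}{2^{m+1}}\right] \text{ and } k \text{ is odd}. \end{cases} \tag{46}$$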
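As a sanity check of (45) and (46), the sketch below computes the cut angle $\gamma$ in two ways: by the backward recursion implied by Lemma 4, and by a closed form. The closed form used here, $\gamma = (k-1)\pi/2^{m+1}$ for even $k$ and $\gamma = k\pi/2^{m+1}$ for odd $k$, is our own simplification obtained by substituting $\arccos(z^*/r_0) = \alpha_m$ from (42) into (46); the function names are hypothetical.

```python
import math

def cut_angle_recursive(alpha0, m):
    """O(m): forward pass for the signs delta_i, then the backward
    recursion gamma_{i-1} = pi/2^{i+1} + delta_i * gamma_i, gamma_m = 0."""
    a, deltas = alpha0, []
    for i in range(1, m + 1):
        d = a - math.pi / 2 ** (i + 1)
        deltas.append(1.0 if d >= 0 else -1.0)
        a = abs(d)
    g = 0.0
    for i in range(m, 0, -1):
        g = math.pi / 2 ** (i + 1) + deltas[i - 1] * g
    return g

def cut_angle_closed_form(alpha0, m):
    """O(1): gamma from the index k of the interval containing alpha0."""
    h = math.pi / 2 ** (m + 1)
    k = max(1, math.ceil(alpha0 / h))
    return (k - 1) * h if k % 2 == 0 else k * h

g = cut_angle_closed_form(1.0, 2)
print(math.cos(g), math.sin(g))  # simplex multipliers (y_0, t_0)
```

The new cut then has coefficients $(y_0, t_0) = (\cos\gamma, \sin\gamma)$, so its cost is independent of the recursion depth $m$.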


Finally, observe that if $\delta_i = 0$ for some $i$, then both expressions in (46) can be converted into a part of a feasible solution of the dual (44), and since they yield the same optimal objective value, either one can be taken for cut construction. In such a way, we have shown that the following proposition holds.


Proposition 7 Consider the SOCP version of problem (1) with $K$ second-order ($p_k = 2$) cone constraints of dimension $N_k + 1$, and its polyhedral approximation (6) obtained by reformulating each second-order cone constraint using the "tower-of-variables" representation (9) and applying Ben-Tal-Nemirovski's lifted polyhedral approximation (11) with parameter of approximation $m$ to the resulting $N_k - 1$ three-dimensional second-order cones. Then,
during  an  iteration  of  the  cutting  plane  scheme  of  Section  3.1,  new  cuts  can  be  generated  in  a  constant  ^k) 

time  that  does  not  depend  on  m. 


Remark 6 While the statement of Proposition 7 parallels that of Proposition 5 for gradient polyhedral approximations of $p$-cones, its significance with respect to Ben-Tal-Nemirovski's lifted polyhedral approximation of quadratic cones is substantially different, due to the fact that Ben-Tal-Nemirovski's approximation is essentially recursive in construction. In this sense, Proposition 7 and Lemma 2 provide a "shortcut" method for computing this recursion in a constant time that does not depend on the recursion's depth.


Remark  7  It  is  well  documented  [11,  15]  that  methods  based  on  polyhedral  approximations  do  not  generally 
outperform  self-dual  interior-point  SOCP  methods.  As  such,  the  new  approximate  solution  method  for  SOCP 
problems introduced by Proposition 7 is not expected to be generally superior to interior-point or first-order solution approaches for SOCP [5, 6, 16, 17]. Nevertheless, the proposed cutting-plane procedure for lifted polyhedral approximations of SOCP problems can provide computational advantages in situations that require repetitive solving of a SOCP instance with slight variations in data. In this context, the resulting approximating problem is an
LP  of  a  moderate  size,  and  an  extensive  body  of  literature  on  solving  such  problems  can  be  utilized,  including 
warm-start  procedures.  As  an  illustration  of  this,  in  the  next  section  we  study  mixed-integer  pOCP  (MIpOCP) 
problems  (3).  The  branch-and-bound  framework  discussed  there  relies  on  repetitive  solution  of  the  polyhedral 
approximation  of  a  continuous  relaxation  of  MIpOCP  problem  instead  of  its  exact  nonlinear  formulation,  and  can 
benefit  significantly  from  warm  start  capabilities  of  the  solvers. 
4 Numerical experiments


Our interest in solving optimization problems with $p$-order cone constraints stems from recent developments in risk averse decision making under uncertainty and stochastic optimization. Namely, mathematical programming problems with $p$-order cone constraints arise naturally in the context of stochastic optimization models whose objective or constraints involve so-called coherent risk measures [2] of a special kind. In this case study we focus on stochastic programming models of portfolio optimization with a certain class of coherent risk measures.


4.1  Portfolio  optimization  with  higher  moment  coherent  risk  measures 

Higher moment coherent risk measures. Given a probability space $(\Omega, \mathcal{F}, P)$, let a random outcome $X$, which represents a cost or a loss, be an element of the linear space $\mathcal{L}_p(\Omega, \mathcal{F}, P)$ of $\mathcal{F}$-measurable functions $X : \Omega \to \mathbb{R}$, where $p \ge 1$. Then, a risk measure $\rho(X)$ can be defined as a mapping $\rho : \mathcal{L}_p \to \mathbb{R}$. In particular, the higher moment coherent risk (HMCR) measures [14], which we focus on in this study, have been defined as optimal values of the following (convex) stochastic programming problem

$$\mathrm{HMCR}_{p,\alpha}(X) = \min_{\eta \in \mathbb{R}}\ \eta + (1-\alpha)^{-1}\left\|[X - \eta]_+\right\|_p, \qquad \alpha \in (0,1),\ p \ge 1, \tag{47}$$

where $[X]_+ = \max\{0, X\}$ and $\|X\|_p = (\mathsf{E}|X|^p)^{1/p}$. By definition, HMCR measures quantify risk in terms of higher tail moments of loss distribution, which are commonly associated with "risk". HMCR measures possess a number of notable properties, including coherence [2] and isotonicity with respect to the second-order stochastic dominance (SSD), which allows for consistency with the utility theory of von Neumann and Morgenstern [24]. Risk measures (47) are also amenable to efficient incorporation in stochastic programming problems, where outcome $X$ is regarded as a function of the decision vector $x$ and random event $\omega \in \Omega$: $X = X(x, \omega)$. Namely, if, as is traditional in stochastic programming, it is assumed that the set $\Omega$ is discrete and consists of $N$ scenarios, $\Omega = \{\omega_1, \ldots, \omega_N\}$, with the corresponding probabilities $\pi_1, \ldots, \pi_N$, then expressions involving HMCR measures, e.g., $\mathrm{HMCR}_{p,\alpha}(X(x, \omega)) \le u$, can be implemented via $(N+1)$-dimensional $p$-order cone constraints. For a detailed discussion of the properties of HMCR measures, see [14].
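For a fixed discrete loss distribution, definition (47) is a one-dimensional convex minimization over $\eta$ and can be evaluated directly. The following sketch is illustrative only and is not the pOCP-based approach used in the paper; the function name `hmcr`, the ternary search, and the bracket-expansion heuristic are our own choices.

```python
def hmcr(losses, alpha, p, probs=None, tol=1e-10):
    """Evaluate HMCR_{p,alpha} of a discrete loss via ternary search
    on the convex function eta -> eta + ||[X - eta]_+||_p / (1 - alpha)."""
    n = len(losses)
    probs = probs or [1.0 / n] * n  # equiprobable scenarios by default

    def f(eta):
        tail = sum(q * max(0.0, x - eta) ** p for x, q in zip(losses, probs))
        return eta + tail ** (1.0 / p) / (1.0 - alpha)

    hi = max(losses)                 # f(eta) = eta for eta >= max loss
    step = max(hi - min(losses), 1.0)
    lo = hi - step
    while f(lo) < f(hi):             # widen until the minimizer is bracketed
        step *= 2.0
        lo = hi - step
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f(0.5 * (lo + hi))

print(hmcr(list(range(1, 11)), alpha=0.8, p=1))
print(hmcr(list(range(1, 11)), alpha=0.8, p=2))
```

For $p = 1$ the value coincides with the average of the worst $(1-\alpha)$ fraction of losses; larger $p$ puts more weight on the tail and yields a value at least as large.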


pOCP portfolio optimization model. In the context of portfolio optimization problems, it is customary to define the cost/loss outcome $X$ as the negative rate of return of the portfolio, $X(x, \omega) = -\mathbf{r}(\omega)^\top x$, where $x$ stands for the vector of portfolio weights, and $\mathbf{r} = \mathbf{r}(\omega)$ is the uncertain vector of assets' returns. Then, one may formulate the problem of minimizing the portfolio risk as given by the HMCR measure, subject to the expected return constraint and the budget constraint as follows:

$$\min_{x \in \mathbb{R}^n_+} \left\{ \mathrm{HMCR}_{p,\alpha}(-\mathbf{r}^\top x) \ \middle|\ \mathsf{E}(\mathbf{r}^\top x) \ge \bar r,\ \mathbf{1}^\top x \le 1 \right\}, \tag{48}$$

where $\bar r$ is the prescribed level of expected return, $x \in \mathbb{R}^n_+$ denotes the no-short-selling requirement, and $\mathbf{1} = (1, \ldots, 1)^\top$. If $\mathbf{r}(\omega)$ is discretely distributed, $P\{\mathbf{r}(\omega) = \mathbf{r}_j\} = \pi_j$, $j = 1, \ldots, N$, then (48) reduces to a pOCP problem with a single $p$-order cone constraint:

$$\begin{aligned}
\min\ \ & \eta + (1-\alpha)^{-1} t\\
\text{s.t.}\ \ & t \ge \|w\|_p,\\
& \mathrm{Diag}\left(\pi_1^{-1/p}, \ldots, \pi_N^{-1/p}\right) w + (\mathbf{r}_1, \ldots, \mathbf{r}_N)^\top x + \mathbf{1}\eta \ge 0,\\
& x^\top(\pi_1\mathbf{r}_1 + \ldots + \pi_N\mathbf{r}_N) \ge \bar r,\\
& \mathbf{1}^\top x \le 1,\\
& x \ge 0,\ w \ge 0,
\end{aligned} \tag{49}$$

where $\mathrm{Diag}(a_1, \ldots, a_k)$ denotes the square $k \times k$ matrix whose diagonal elements are equal to $a_1, \ldots, a_k$ and off-diagonal elements are zero.
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MIpOCP  portfolio  optimization  models  In  addition  to  the  convex  portfolio  optimization  model  (48),  we  con¬ 
sider  two  mixed-integer  extensions  of  (48).  One  of  them  is  a  cardinality-constrained  portfolio  optimization  prob¬ 
lem,  which  allows  for  no  more  than  M  assets  in  the  portfolio,  where  M  is  a  given  constant: 


min  |  HMCRq;  p(— rTx) 

xeR(|_,ze{0,i}"  ' 


E(rTx)  >  r,  lTx<l,  x  <  z,  lTz  <  M 


(50) 


Similarly  to  (48),  formulation  (50)  represents  a  0-1  MIpOCP  problem  with  a  single  conic  constraint.  In  addition, 
we  consider  portfolio  optimization  with  lot-buying  constraints,  which  reflect  a  common  real-life  trading  policy  that 
assets  can  only  be  bought  in  lots  of  shares  (for  instance,  in  multiples  of  1,000  shares).  In  this  case,  the  portfolio 
allocation  problem  can  be  formulated  as  MIpOCP  with  a  p- order  cone  constraint. 


min  <  HMCRq.  „(— r  x) 

xeM'I,  z€Z'i  ' 


E(rTx)  >  r,  lTx  <1,  x  =  —  Diag(p)z 


(51) 


where $L$ is the size of the lot, $C$ is the investment capital (in dollars), and the vector $p \in \mathbb{R}^n$ represents the prices of assets.
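
As a small illustration of the lot-buying constraint in (51), the following Python sketch converts integer lot counts into portfolio weights. It assumes that the coefficient in (51) reads $L/C$, so that $x_i = L\,p_i z_i / C$ is the fraction of capital invested in asset $i$. The function names and the sample prices are hypothetical and serve only to make the relation between $x$, $z$, $L$, $C$, and $p$ concrete.

```python
def lots_to_weights(z, prices, lot_size, capital):
    """Map integer lot counts z to capital fractions x_i = L * p_i * z_i / C."""
    if len(z) != len(prices):
        raise ValueError("z and prices must have the same length")
    if any(int(zi) != zi or zi < 0 for zi in z):
        raise ValueError("lot counts must be nonnegative integers")
    return [lot_size * p * zi / capital for p, zi in zip(prices, z)]


def satisfies_budget(x, tol=1e-12):
    """Budget constraint 1^T x <= 1 together with no short selling (x >= 0)."""
    return all(xi >= 0 for xi in x) and sum(x) <= 1 + tol


# Hypothetical data: L = 100 shares per lot, C = 100,000 dollars, two assets.
weights = lots_to_weights(z=[5, 10], prices=[10.0, 20.0],
                          lot_size=100, capital=100_000)
within_budget = satisfies_budget(weights)
```

With these sample numbers, five lots of a 10-dollar stock and ten lots of a 20-dollar stock use 5% and 20% of the capital, respectively, so the budget constraint holds.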

The  following  proposition  ensures  that  the  introduced  portfolio  optimization  problems  with  HMCR  measures  (48)- 
(51)  are  amenable  to  the  polyhedral  approximation  solution  approach  discussed  in  the  previous  sections. 


Proposition 8  If pOCP problem (49) is feasible, then it satisfies the approximation conditions (8) of Proposition 1. Moreover, the same applies to continuous relaxations of MIpOCP problems (50) and (51).


Proof: Evidently, the strict feasibility condition (8a) can always be satisfied by selecting sufficiently large $t$ and $\eta$ in (49). To see that (49) is “semibounded” in the sense of (8b), note that the only unrestricted variable in the problem is $\eta$, but due to the properties of the optimal solution of (47) (see [14]) it can be bounded as $|\eta| \le \max_{j,x}\{|r_j^\top x|\} \le \max_j \|r_j\|_\infty$. The same arguments apply to relaxations of (50) and (51). □
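
The role of the scalar variable $\eta$ can be illustrated numerically. The sketch below evaluates an HMCR-type measure for a discretely distributed loss, assuming the definition $\mathrm{HMCR}_{\alpha,p}(X) = \min_\eta\, \eta + (1-\alpha)^{-1}\|(X-\eta)^+\|_p$ that is consistent with formulation (49). The function name `hmcr`, the ternary search, and the search bracket for $\eta$ are illustrative choices made here, not the procedure used in the experiments, and the bracket differs from the bound used in the proof above.

```python
def hmcr(losses, probs, alpha, p, iters=200):
    """Return (value, eta) for min_eta eta + ||(X - eta)_+||_p / (1 - alpha).

    Assumes 0 < alpha < 1 and p >= 1; the objective is convex in eta,
    so a ternary search over a bracket containing a minimizer suffices.
    """
    def objective(eta):
        tail = sum(pi * max(x - eta, 0.0) ** p for x, pi in zip(losses, probs))
        return eta + tail ** (1.0 / p) / (1.0 - alpha)

    mean = sum(pi * x for x, pi in zip(losses, probs))
    hi = max(losses)                          # for eta >= max loss, objective = eta
    lo = (mean - (1.0 - alpha) * hi) / alpha  # below this, objective exceeds max loss
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) <= objective(b):
            hi = b
        else:
            lo = a
    eta = 0.5 * (lo + hi)
    return objective(eta), eta


# Ten equiprobable loss scenarios 1, 2, ..., 10 (illustrative data only).
sample_losses = [float(k) for k in range(1, 11)]
sample_probs = [0.1] * 10
cvar_value, _ = hmcr(sample_losses, sample_probs, alpha=0.5, p=1)
hmcr2_value, _ = hmcr(sample_losses, sample_probs, alpha=0.5, p=2)
```

For $p = 1$ the measure coincides with Conditional Value-at-Risk, so on this sample the value for $\alpha = 0.5$ is the average of the five largest losses; for $p = 2$ the value is at least as large and never exceeds the maximum loss.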


Implementation and Scenario Data. We used the LP and Barrier MIP solvers of IBM ILOG CPLEX 12.2 to obtain solutions to the formulated portfolio optimization problems. All problems were coded in C++, and computations were run on a 3 GHz PC with 4 GB RAM in a 32-bit Windows XP environment. Additional details of the numerical experiments are discussed in the corresponding subsections below.

In both continuous and discrete portfolio optimization problems, we used historical data for $n$ stocks chosen at random from the S&P500 index. Namely, returns over $N$ consecutive 10-day periods starting at a (common) randomized date were used to construct the set of $N$ equiprobable scenarios ($\pi_j = N^{-1}$, $j = 1, \dots, N$) for the stochastic vector $r$. The values of parameters $L$, $C$, $M$, $\alpha$, and $\bar r$ were set as follows: $L = 100$, $C = 100{,}000$, $M = 5$, $\alpha = 0.9$, $\bar r = 0.005$.
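
The scenario construction described above can be sketched as follows. This is a minimal illustration under assumptions not stated in the text: the 10-day periods are taken to be non-overlapping, returns are simple (not logarithmic) returns, and the helper names and the toy price series are hypothetical.

```python
def scenario_returns(prices, period=10):
    """Simple returns of one asset over consecutive, non-overlapping windows."""
    n_periods = (len(prices) - 1) // period
    returns = []
    for j in range(n_periods):
        start, end = prices[j * period], prices[(j + 1) * period]
        returns.append(end / start - 1.0)
    return returns


def equiprobable_scenarios(price_table, period=10):
    """Build scenarios r_1, ..., r_N with probabilities pi_j = 1/N.

    price_table maps an asset name to its daily prices over a common date range.
    """
    per_asset = {a: scenario_returns(p, period) for a, p in price_table.items()}
    n_scen = min(len(r) for r in per_asset.values())
    assets = sorted(per_asset)
    scenarios = [[per_asset[a][j] for a in assets] for j in range(n_scen)]
    probs = [1.0 / n_scen] * n_scen
    return scenarios, probs


# Toy data: asset "A" grows 1% per day, asset "B" is flat; 21 prices give N = 2.
toy_prices = {"A": [100.0 * 1.01 ** k for k in range(21)], "B": [50.0] * 21}
toy_scenarios, toy_probs = equiprobable_scenarios(toy_prices)
```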

4.2  Cutting plane techniques for the lifted and gradient approximations of SOCP problems

The pOCP formulation (49) of portfolio selection model (48) was used to evaluate the performance of polyhedral approximation-based solution methods discussed in Section 3. Particularly, we were interested in comparing the cutting plane methods for solving gradient ($p = 2$) and lifted polyhedral approximations of SOCP problems that were presented in Sections 3.2 and 3.3, respectively. Recall that the gradient polyhedral approximation, while being applicable to cones of arbitrary order $p \in (1, \infty)$, in the case of $p = 2$ is inferior to Ben-Tal and Nemirovski’s lifted polyhedral approximation of second-order cones. At the same time, the results of Sections 3.2 and 3.3 demonstrate that, in the context of the cutting plane scheme of Section 3.1, both types of polyhedral approximations are amenable to generation of cutting planes in a constant time that does not depend on the accuracy of approximation. Thus, it was of interest to compare the cutting plane techniques for gradient and lifted approximations of the SOCP ($p = 2$) version of portfolio optimization problem (49).

In particular, four types of solution methods were studied. First, the complete LP formulation of Ben-Tal-Nemirovski’s lifted polyhedral approximation of problem (49) with $p = 2$ was solved using the CPLEX 12.2 LP solver (referred to as “LP-lifted” below). Second, this polyhedral approximation LP was solved using the cutting plane method of Section 3.1 combined with the fast cut generation technique of Section 3.3 (referred to as “CG-lifted”).

Third,  the  SOCP  version  of  (49)  was  solved  using  the  “exact”  cutting  plane  method  of  Proposition  6  (recall  that 
this  cutting  plane  method  derives  from  the  corresponding  scheme  for  gradient  polyhedral  approximation,  but  does 
not  require  a  polyhedral  approximation  problem  to  be  formulated).  This  method  is  referred  to  as  “CG-exact”. 
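
To give a flavor of cut generation that works directly with the nonlinear cone, the sketch below computes a generic supporting-hyperplane (gradient) cut for the constraint $t \ge \|w\|_p$ at a point $w^*$ that violates it. It is an illustration only: the exact scheme of Proposition 6 is not reproduced here and may differ in detail, and the function names are hypothetical. Validity of the cut follows from Hölder's inequality, since the gradient $g$ of $\|\cdot\|_p$ at $w^* \ne 0$ has unit dual norm.

```python
def p_norm(w, p):
    """The p-norm (sum |w_j|^p)^(1/p) for p > 1."""
    return sum(abs(wj) ** p for wj in w) ** (1.0 / p)


def gradient_cut(w_star, p):
    """Coefficients g of the cut t >= g . w supporting {t >= ||w||_p} at w_star."""
    norm = p_norm(w_star, p)
    if norm == 0.0:
        raise ValueError("the p-norm is not differentiable at the origin")
    sign = lambda v: 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
    return [sign(wj) * (abs(wj) / norm) ** (p - 1) for wj in w_star]


def is_violated(t, w, p, tol=1e-9):
    """True if (t, w) lies outside the p-order cone by more than tol."""
    return p_norm(w, p) > t + tol


# Example: the point (t, w) = (1, (3, -4)) violates the second-order cone.
g2 = gradient_cut([3.0, -4.0], p=2)
```

In a cutting plane loop, the linear inequality $t \ge g^\top w$ would be added to the master LP whenever `is_violated` reports a violation at the current LP solution.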

Lastly,  we  solved  a  gradient  polyhedral  approximation  of  the  SOCP  version  of  (49)  using  the  cutting  plane  method 
of  Section  3.1  with  the  fast  cut-generation  scheme  of  Section  3.2.  The  gradient  polyhedral  approximation  was, 
however,  “optimized”  in  this  case  to  reduce  the  number  of  approximating  facets  as  described  below,  and  is  referred 
to  as  “CG-grad-opt”. 

Recall that Proposition 3 furnishes an expression for the approximation accuracy $\varepsilon$ of an $(N+1)$-dimensional $p$-cone provided that each of the three-dimensional $p$-cones is approximated with the same accuracy $\epsilon$. It can be shown (see [11]) that in the case of the lifted approximation technique [9] applied to second-order cones, the size of the polyhedral approximation can be reduced without sacrificing its accuracy $\varepsilon$ by properly selecting the accuracies $\epsilon_i$ of 3D cone approximations at each level $i$ of the “tower of variables”. This approach can also be utilized in the case of lifting procedure (9) for $p$-cones,

$$
\xi_0 = \xi_{2N-1}, \qquad \xi_{N+j} \ge \|(\xi_{2j-1}, \xi_{2j})\|_p, \quad j = 1, \dots, N-1.
$$

Particularly, by introducing approximation accuracies for 3D $p$-cones at each “level” as $\epsilon_1, \epsilon_2, \dots, \epsilon_\ell$, where $\ell = \lceil \log_2 N \rceil$, one can observe that

$$
\begin{aligned}
\xi_0^p = \xi_{2N-1}^p
&\ge \frac{\xi_{2N-3}^p}{(1+\epsilon_1)^p} + \frac{\xi_{2N-2}^p}{(1+\epsilon_1)^p} \\
&\ge \frac{\xi_{2N-7}^p}{(1+\epsilon_1)^p(1+\epsilon_2)^p} + \frac{\xi_{2N-6}^p}{(1+\epsilon_1)^p(1+\epsilon_2)^p} + \frac{\xi_{2N-5}^p}{(1+\epsilon_1)^p(1+\epsilon_2)^p} + \frac{\xi_{2N-4}^p}{(1+\epsilon_1)^p(1+\epsilon_2)^p} \\
&\ge \dots \ge \frac{\xi_1^p}{\prod_{i=1}^{k_1}(1+\epsilon_i)^p} + \dots + \frac{\xi_N^p}{\prod_{i=1}^{k_N}(1+\epsilon_i)^p},
\end{aligned}
$$

where once again $k_j \in \{\lceil \log_2 N \rceil - 1, \lceil \log_2 N \rceil\}$ is the number of “levels” in the “tower of variables” on the way from $\xi_{2N-1}$ to $\xi_j$. Then, the total number of approximation facets can be reduced by solving the following problem:

$$
\min_{m_i \in \mathbb{N}_+} \left\{ \sum_{i=1}^{\ell} \varphi_i m_i \;\middle|\; 1 + \varepsilon \ge \prod_{i=1}^{\ell} \big(1 + \epsilon_i(m_i)\big) \right\} \tag{52}
$$

where, for a given $i$, $m_i$ is the number of facets in the polyhedral approximation of a 3D $p$-cone at “level” $i$, $\epsilon_i = \epsilon_i(m_i)$ is the main term of the corresponding approximation accuracy, and $\varphi_i$ is the number of 3D $p$-cones thusly approximated. The objective of (52) represents the total number of approximation facets, while the constraint ensures that the desired approximation accuracy $\varepsilon$ of the multidimensional $p$-cone is achieved. A feasible solution to (52) can be obtained analytically by solving its continuous relaxation with the relaxed constraint $\sum_{i=1}^{\ell} \epsilon_i(m_i) = \ln(1 + \varepsilon)$, and then taking $m_i = \lceil m_i^* \rceil$, where $m^*$ is the solution of the relaxed problem. This procedure resulted in, on average, a 30% reduction in the number of approximating facets for the uniform gradient polyhedral approximation.
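
The reason a solution of the relaxed problem remains feasible for (52) can be seen from the elementary inequality $\ln(1+x) \le x$ for $x \ge 0$. The short derivation below is a sketch under two assumptions that are not spelled out in the text: that the relaxed constraint bounds the sum of the level accuracies by $\ln(1+\varepsilon)$, and that each $\epsilon_i(\cdot)$ is nonincreasing in the number of facets $m_i$.

```latex
\begin{align*}
\sum_{i=1}^{\ell} \epsilon_i(m_i) \le \ln(1+\varepsilon)
\;&\Longrightarrow\;
\sum_{i=1}^{\ell} \ln\bigl(1+\epsilon_i(m_i)\bigr)
  \le \sum_{i=1}^{\ell} \epsilon_i(m_i) \le \ln(1+\varepsilon) \\
\;&\Longrightarrow\;
\prod_{i=1}^{\ell} \bigl(1+\epsilon_i(m_i)\bigr) \le 1+\varepsilon .
\end{align*}
```

Under the monotonicity assumption, rounding each relaxed value up to the next integer can only decrease the accuracies $\epsilon_i$, so the rounded point still satisfies the first inequality and hence the constraint of (52).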

The results are summarized in Table 1, where for each combination of the number of assets $n$, number of scenarios $N$, and approximation accuracy $\varepsilon$, the running times are averaged over 20 instances. It has been noted that for the linear programming problems resulting from the lifted approximation, the CPLEX Dual Simplex solver performed better on smaller problem instances, while the CPLEX Barrier solver was superior on larger instances. Thus, we used the Barrier solver for all instances except for the two smaller problem sizes (the first six rows in Table 1). At the same time, for the cut-generation approaches we used the CPLEX Dual Simplex solver (selected by default).

n, N          ε        LP-lifted   CG-lifted   CG-exact   CG-grad-opt
50, 500       10^-2         0.43        0.12       0.11          0.10
              10^-4         0.63        0.18       0.17          0.14
              10^-8         2.77        0.31       0.32          0.32
150, 1500     10^-2         1.83        0.96       0.98          0.89
              10^-4         3.85        1.24       1.18          1.09
              10^-8        16.29        1.67       1.65          1.64
150, 3000     10^-2        37.24        1.66       1.29          1.98
              10^-4        96.39        5.80       5.03          5.52
              10^-8       296.20       15.11      15.63         15.55
200, 5000     10^-2       151.91        9.31      10.20          7.46
              10^-4       230.21       23.49      22.76         22.87
              10^-8       791.41       48.30      47.48         47.08
200, 10000    10^-2       320.80       17.93      18.52         17.26
              10^-4       624.63       45.96      46.56         45.09
              10^-8           --       97.13      96.23         96.97
200, 20000    10^-2       677.14       31.56      31.15         30.21
              10^-4       898.74       85.95      86.43         84.12
              10^-8          ***      195.99     196.20        195.36

Table 1: Average running time (in seconds) for solving portfolio optimization problem (48)-(49) with $p = 2$, where $n$, $N$, and $\varepsilon$ denote the number of assets, the number of scenarios (dimension of the cone), and the approximation accuracy of the cone constraint, respectively. “LP-lifted” corresponds to solving the full LP resulting from the lifted polyhedral approximation due to [9]; “CG-lifted” to solving this LP using the cut generation technique of Section 3.3; “CG-exact” to solving the SOCP problem using the “exact” cutting plane method of Proposition 6; and “CG-grad-opt” to solving the LP resulting from the gradient polyhedral approximation with reduced number of facets due to (52) using the cut generation of Section 3.2. All running times are averaged over 20 instances. Symbol “--” indicates cases when computations exceeded the 1 hour time limit, while “***” indicates cases for which the solver returned an “Out of memory” error.


It follows from Table 1 that the cutting plane technique of Sections 3.1 and 3.3 for solving Ben-Tal-Nemirovski’s lifted approximations of SOCP problems (“CG-lifted”) provides significant computational improvements over solving the “complete” LP formulation of such approximations (“LP-lifted”): on instances with $N \ge 3000$ it is roughly 10 to 20 times faster. This is consistent with the corresponding findings reported in [15] for uniform gradient polyhedral approximations of pOCP problems. It is also worth noting that the performance of the cutting plane method of Section 3.1 in combination with the fast cut generation of Section 3.3 (“CG-lifted”) is on par with that of the “exact” cutting plane method of Proposition 6 (“CG-exact”). The cutting plane method of Sections 3.1 and 3.2 for gradient polyhedral approximations with a reduced number of facets (“CG-grad-opt”) is slightly faster than the other two cutting plane methods in most cases, though the observed improvement is insignificant. Finally, we note that relatively few iterations of the cutting plane methods were required to reach optimality in the corresponding problems; for instance, in the case of the exact solution method (“CG-exact”), an $\varepsilon$-optimal solution was obtained after an average of 11 to 12 iterations for $\varepsilon = 10^{-8}$. Interestingly, the number of iterations exhibited rather little dependence on the problem size: for example, instances with $N = 5{,}000$, $N = 10{,}000$, and $N = 20{,}000$ required an average of 11.2, 11.4, and 11.5 iterations, respectively, to be solved within a $10^{-8}$ accuracy.


4.3  Polyhedral approximations and cutting plane techniques for rational-order mixed-integer pOCP problems

The approaches to constructing and solving polyhedral approximations of pOCP problems (1) described above can also be efficiently applied to mixed-integer extensions of pOCP (MIpOCP) (3); in particular, we consider rational-order MIpOCP problems, i.e., instances of (3) where all $p_k$ are rational: $p_k = r_k / s_k$.

The existing literature on mixed-integer programming problems with conic constraints is relatively limited, with the majority of research in this area being focused on solving mixed-integer problems with self-dual cone constraints, particularly second-order cone and semidefinite cone constraints. Mixed-integer second-order conic programming problems of type (3) with $p = 2$ have recently been studied in [3, 4, 10, 22] and some others. Particularly, Çezik and Iyengar [10] discuss application of Chvátal-Gomory and disjunctive cuts for 0-1 conic programming.
Vielma  et  al.  [22]  proposed  a  branch-and-bound  algorithm  for  mixed-integer  second-order  cone  programming 
(MISOCP)  problems  that  allows  for  significant  computational  savings  by  employing  Ben-Tal-Nemirovski’s  lifted 
polyhedral  approximation  of  the  SOCP  relaxation  at  each  node  of  the  branch-and-bound  tree  instead  of  solving 
the  nonlinear  SOCP  relaxation  itself,  which  is  only  invoked  when  an  integer-valued  solution  of  the  polyhedral 
approximation is found, and is used to declare incumbent or branch further. Atamtürk and Narayanan [3, 4]
developed  mixed-integer  rounding  cuts  for  MISOCP  problems,  as  well  as  lifted  cuts  for  general  mixed-integer 
cone  programming  problems,  which  were  then  applied  to  derive  lifted  cuts  for  0-1  MISOCP  problems.  These 
techniques were extended to the case of general MIpOCP problems with $p \ne 2$ in [23]. In another recent work by
Belotti  et  al.  [7],  nonlinear  disjunctive  conic  cuts  for  MISOCP  problems  were  proposed. 

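In this study of MIpOCP problems (3), we follow the approach of Vielma et al. [22], i.e., instead of solving a nonlinear pOCP relaxation of (3) at each node $i$ of the branch-and-bound tree,

$$
\begin{aligned}
\min\ \ & c^\top x + d^\top z \\
\text{s.t.}\ \ & Ax + Bz \le b, \\
& \big\| C^{(k)} x + D^{(k)} z + e^{(k)} \big\|_{p_k} \le h^{(k)\top} x + g^{(k)\top} z + f^{(k)}, \quad k = 1, \dots, K, \\
& x \in \mathbb{R}^n, \quad \underline{z}^{(i)} \le z \le \overline{z}^{(i)},
\end{aligned} \tag{53}
$$

we solve its polyhedral approximation

$$
\begin{aligned}
\min\ \ & c^\top x + d^\top z \\
\text{s.t.}\ \ & Ax + Bz \le b, \\
& H^{(k)}_{p_k} \begin{pmatrix} C^{(k)} x + D^{(k)} z + e^{(k)} \\ h^{(k)\top} x + g^{(k)\top} z + f^{(k)} \\ w^{(k)} \end{pmatrix} \ge 0, \quad k = 1, \dots, K, \\
& x \in \mathbb{R}^n, \quad \underline{z}^{(i)} \le z \le \overline{z}^{(i)},
\end{aligned} \tag{54}
$$

where $\underline{z}^{(i)}$, $\overline{z}^{(i)}$ are the lower and upper bounds on the relaxed values of variables $z$, and the approximation matrix $H^{(k)}_{p_k}$ is constructed using lifting procedure (9) and applying gradient polyhedral approximation (13) to the resulting 3D $p$-cones. In particular, we employ the fast cutting plane scheme for polyhedral gradient approximation presented in Section 3.2 to solve the LP problem (54) at each node of the tree.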

Only when an integer-valued solution of (54) is found must the exact pOCP relaxation (53) of the MIpOCP be solved, with bounds on the relaxed values of variables $z$ determined by the integer-valued solution in question, in order to check its feasibility with respect to the exact nonlinear formulation (3) and declare an incumbent or branch further (see [22] for details). To solve the pOCP relaxation (53) exactly, we reformulate (53) in the SOCP form by representing $p$-order cone constraints via a set of second-order cones. Such a representation is available for rational-order cones (see, e.g., [1, 8, 20]), but it is generally non-unique and requires $O(N \log r)$ three-dimensional rotated quadratic cones to represent an $(N+1)$-dimensional $p$-cone with $p = r/s$ [15]. We use the “economical” SOCP representation of rational-order cones due to Morenko et al. [18], which allows for replacing an $(r/s)$-cone in $\mathbb{R}^{N+1}$ with exactly $\lceil \log_2 r \rceil N$ quadratic cones; in application to (53) with $p_k = r_k / s_k$ it yields a SOCP problem of the form

$$
\begin{aligned}
\min\ \ & c^\top x + d^\top z \\
\text{s.t.}\ \ & Ax + Bz \le b, \\
& \begin{pmatrix} C^{(k)} x + D^{(k)} z + e^{(k)} \\ h^{(k)\top} x + g^{(k)\top} z + f^{(k)} \\ w^{(k)} \end{pmatrix} \in \mathcal{S}^{N_k}_{r_k/s_k}, \quad k = 1, \dots, K, \\
& x \in \mathbb{R}^n, \quad \underline{z} \le z \le \overline{z},
\end{aligned} \tag{55}
$$

where $\mathcal{S}^{N_k}_{r_k/s_k}$ is a set of $N_k \lceil \log_2 r_k \rceil$ “rotated” quadratic three-dimensional cones of the form $\xi_0^2 \le \xi_1 \xi_2$ that is equivalent to the original $(N_k + 1)$-dimensional $p_k$-cone.

In summary, the proposed branch-and-bound method for MIpOCP problems relies primarily on a polyhedral approximation (54) of the problem’s continuous relaxation that is solved using the fast cutting plane generation technique. Additionally, a SOCP solver is called to obtain an exact solution of the SOCP reformulation (55) of the MIpOCP relaxation when a new incumbent solution is found. Alternatively, the exact cutting-plane algorithm described in Proposition 6 can be used to solve the MIpOCP relaxation (53) for each new incumbent solution. In our computational experiments, the choice of one or the other exact solution method did not have a noticeable effect on the overall performance, since the bulk of the computational time is spent at the non-integer nodes of the branch-and-bound tree, and calls to an exact solver were made only occasionally.

The described polyhedral approximation-based approach to solving MIpOCP problems was coded in C++ using CPLEX Concert Technology. In particular, the cutting plane scheme for solving the polyhedral approximation (54) of the relaxation (53) of the MIpOCP problem was implemented using CPLEX’s callback functionality, and the SOCP reformulation (55) of (53) was solved using the CPLEX Barrier solver.

The computational performance of this algorithm (referred to as BnB/CP below) was compared to that of the standard CPLEX 12.2 MIP Barrier solver, which was employed to solve MIpOCP problems in the SOCP reformulation:

$$
\begin{aligned}
\min\ \ & c^\top x + d^\top z \\
\text{s.t.}\ \ & Ax + Bz \le b, \\
& \begin{pmatrix} C^{(k)} x + D^{(k)} z + e^{(k)} \\ h^{(k)\top} x + g^{(k)\top} z + f^{(k)} \\ w^{(k)} \end{pmatrix} \in \mathcal{S}^{N_k}_{r_k/s_k}, \quad k = 1, \dots, K, \\
& x \in \mathbb{R}^n, \quad z \in \mathbb{Z}^m,
\end{aligned}
$$

where, as before, $\mathcal{S}^{N_k}_{r_k/s_k}$ denotes the set of second-order cones equivalent to an $(N_k + 1)$-dimensional $(r_k/s_k)$-cone constructed in accordance with [18].

Namely, the BnB/CP algorithm and the CPLEX MIP Barrier solver were applied to MIpOCP problems with $p = 3.0$ in the form of portfolio optimization with cardinality constraints (50) and lot-buying constraints (51) of various sizes (number of integer variables $n = 50, 100, 200$; dimensionality of the $p$-cone $N = 250, \dots, 1500$). The results are summarized in Tables 2 and 3, respectively, where the running times are averaged over 20 instances. Observe that in the case of cardinality-constrained portfolio optimization problems, the proposed BnB/CP method is inferior to the standard CPLEX MIP Barrier solver on smaller instances, and outperforms it on larger instances. This trend is confirmed by the numerical experiments on portfolio optimization problems with lot-buying constraints, which are generally harder to solve than the cardinality-constrained problems. In this latter case, the BnB/CP method dominates the standard CPLEX MIP Barrier solver on all problem instances. Moreover, it is important to point out that CPLEX 12.2 employs its own polyhedral approximations of second-order cones for solving MISOCP problems, and the results presented in Tables 2 and 3 demonstrate the contribution of the proposed fast cutting plane techniques for solving the polyhedral approximations of conic programming problems.

Note that the chosen value of the parameter $p = 3.0$ in (50) and (51) provided for conditions in which the SOCP reformulation approach would be most competitive with the proposed BnB/CP method. In accordance with the above, the value of $p = 3$ allows for the smallest number, $\lceil \log_2 3 \rceil N = 2N$, of quadratic cones in the SOCP reformulation when $p = r/s \ne 2$. A larger number of quadratic cones in MISOCP reformulations of rational-order MIpOCP problems generally leads to longer solution times, while the size of the polyhedral approximations used in the proposed BnB/CP method does not depend on $p$, resulting in relatively constant solution times.
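
To make the count $\lceil \log_2 3 \rceil N = 2N$ concrete, the following sketch shows one way to represent the cone $t \ge \|w\|_3$, $w \ge 0$, using two rotated quadratic cones per component. It illustrates the general idea only and is not claimed to coincide with the construction of [18]; the auxiliary variables $u_j$ and $v_j$ are introduced here purely for illustration.

```latex
\begin{align*}
t \ge \Bigl(\sum_{j=1}^{N} w_j^3\Bigr)^{1/3},\ w \ge 0
\quad&\Longleftrightarrow\quad
\exists\, u \ge 0:\ \sum_{j=1}^{N} u_j \le t, \quad
w_j^3 \le u_j t^2,\ \ j = 1, \dots, N, \\
w_j^3 \le u_j t^2,\ w_j \ge 0
\quad&\Longleftrightarrow\quad
\exists\, v_j \ge 0:\ w_j^2 \le v_j t, \quad v_j^2 \le u_j w_j .
\end{align*}
```

In the second equivalence, the reverse direction follows by chaining the two inequalities, $w_j^4 \le v_j^2 t^2 \le u_j w_j t^2$, and the forward direction by taking $v_j = w_j^2 / t$ when $t > 0$ (and $v_j = 0$ otherwise). Both inequalities on the right have the rotated form $\xi_0^2 \le \xi_1 \xi_2$, giving two three-dimensional cones for each of the $N$ components.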


             n = 50                   n = 100                  n = 200
N       Barrier MIP   BnB/CP     Barrier MIP   BnB/CP     Barrier MIP   BnB/CP
250         8.43       11.96        13.12       14.56        21.45       32.90
500        11.67       15.43        37.68       36.79        60.11       65.87
1000       12.77       19.58        38.18       35.40        89.36       75.81
1500       33.80       47.01       107.27       92.63       284.44      190.46

Table  2:  Average  running  times  (in  seconds)  for  BnB/CP  implementation  of  portfolio  optimization  problem  with  cardinality 
constraint  (50)  and  p  =  3.0,  benchmarked  against  IBM  ILOG  CPLEX  12.2  MIP  Barrier  solver  applied  to  SOCP  reformulation 
of  (50).  Better  running  times  are  highlighted  in  bold. 


             n = 50                   n = 100                  n = 200
N       Barrier MIP   BnB/CP     Barrier MIP   BnB/CP     Barrier MIP   BnB/CP
250        38.46       27.91       114.77       82.92      1020.84      743.22
500        99.41       55.17       339.63      254.41      2163.89     1196.76
1000      586.51      506.10      2666.62     2395.59        1.99%       1.18%

Table  3:  Average  running  times  (in  seconds)  for  BnB/CP  implementation  of  portfolio  optimization  problem  with  lot-buying 
constraints  (51)  and  p  =  3.0,  benchmarked  against  IBM  ILOG  CPLEX  12.2  MIP  Barrier  solver  applied  to  SOCP  reformulation 
of  (51).  Better  running  times  are  highlighted  in  bold,  and  XX%  denotes  the  integrality  gap  after  1  hour. 


5  Conclusions 

In this paper we discussed the use of polyhedral approximations in the context of solving linear and mixed-integer programming problems with $p$-order cone constraints. In particular, we showed that the fast cutting-plane method for solving pOCP problems originally proposed by Krokhmal and Soberanis [15] for a special case of gradient approximation of $p$-cones, which allows for cut generation in a constant time independent of the approximation accuracy, can be extended to a broader class of polyhedral approximations. Moreover, we proposed a variation of this approach that constitutes an exact pOCP solution method with $O(\varepsilon^{-1})$ iteration complexity. In addition, we showed that generation of cutting planes in a time that is independent of the approximation accuracy is available for the lifted polyhedral approximation of second-order cones due to Ben-Tal and Nemirovski [9], which is itself recursively constructed, with the number of recursion steps being dependent on the desired accuracy. Finally, we demonstrated that the developed cutting plane techniques can be effectively applied to obtain exact solutions of mixed-integer $p$-order cone programming problems.

Abstract

In this work, we consider a class of risk-averse maximum weighted subgraph problems (R-MWSP). Namely, assuming that each vertex of the graph is associated with a stochastic weight, such that the joint distribution is known, the goal is to obtain a subgraph of minimum risk satisfying a given hereditary property. We employ a stochastic programming framework that is based on the formalism of modern theory of risk measures in order to find minimum-risk hereditary structures in graphs with stochastic vertex weights. The introduced form of risk function for measuring the risk of subgraphs ensures that optimal solutions of R-MWS problems represent maximal subgraphs. A graph-based branch-and-bound algorithm for solving the proposed problems is developed and illustrated on a special case of risk-averse maximum weighted clique problem. Numerical experiments on randomly generated Erdős-Rényi graphs demonstrate the computational performance of the developed branch-and-bound algorithm.

Keywords: Risk-averse maximum weighted subgraph problem, risk-averse maximum clique problem, maximum weight clique problem, stochastic weights, coherent risk measures


1  Introduction  and  motivation 

For  decades,  network  problems  with  topologically  exogenous  information  have  occupied  a  prominent 
place  in  the  graph  theory  and  network  science  literature.  A  popular  class  of  problems  of  this  type  involves 
finding  a  subset  of  minimum  or  maximum  weight  and  conforming  to  a  prescribed  structural  property  in 
a  graph  whose  vertices  are  characterized  by  deterministic  weights  [4,  5,  14,  22,  25].  Several  influential 
studies  have  established  a  foundation  for  exact  combinatorial  solution  algorithms  for  such  problems 
[6,  11,  26].  Most  notably,  Carraghan  and  Pardalos  [11]  developed  a  backtracking  branch-and-bound 
method  for  efficiently  solving  the  maximum  clique  problem  by  exploiting  the  hereditary  property  [30] 
of  complete  subgraphs.  Many  extensions  of  their  work  improved  upon  the  process  of  reducing  the 
search  space  by  using  vertex  coloring  schemes  for  branching  and  for  obtaining  upper  bounds  on  the 
maximum  achievable  subgraph  order  (see,  e.g.,  [10,  17,  29]).  Analogous  weight-based  procedures  have 
also  been  used  when  seeking  a  maximum  weight  subgraph  in  the  presence  of  deterministic  vertex  weights 
[4,21,25]. 

Significant emphasis has also been placed on network problems with uncertain exogenous information evidenced in various forms that influences the overall topology, flow distribution and costs, etc. Particularly common are considerations of stochastic factors in the context of network flow and vehicle routing problems where uncertainties are attributed to arc capacities or node demands [3, 9, 15, 16]. Also, a number of studies examined the effects of probabilistic arc failures in networks [1, 31] and introduced risk-based approaches to minimize the corresponding flow losses [8, 28]. The problem of finding a subset of vertices of maximum cardinality that form a clique with a specified probability, given that edges in the graph can fail with some probabilities, is studied in [23]; a similar approach in application to certain clique relaxations is pursued in [34]. Although uncertainties in most of the aforementioned cases influence decisions related to directed network flows, far less emphasis has been placed on examining decision making regarding optimal subgraph topologies and resource allocation in settings where uncertainties are induced by stochastic factors associated with network vertices.

In this work, we employ a stochastic programming framework that is based on the formalism of risk measures [18], and in particular, coherent risk measures [2, 12], in order to find minimum-risk structures in graphs with stochastic vertex weights. Namely, we consider a class of risk-averse maximum weighted¹ subgraph problems (R-MWSP) that represent a stochastic extension of the so-called maximum weight subgraph problems considered in the literature in the context of hereditary graph-theoretical properties. We propose a graph-based branch-and-bound algorithm for solving problems in the R-MWSP class, which is generally applicable to maximum weight subgraph problems where a subgraph’s weight is given by a super-additive function whose evaluation requires solving an optimization problem. As an illustrative example of the proposed concepts, we consider a risk-averse maximum weighted clique problem.

The  remainder  of  the  paper  is  organized  as  follows.  In  Section  2  we  introduce  the  general  formulation 
of R-MWS problems and discuss their properties. Section 3 presents solution methods for R-MWSP, including a mathematical programming formulation and a graph-based (combinatorial) branch-and-bound
method.  Finally,  Section  4  considers  a  numerical  case  study  on  solving  risk-averse  maximum  weighted 
clique  problems,  where  risk  is  quantified  using  a  class  of  nonlinear  coherent  risk  measures,  in  randomly 
generated  graphs  with  various  densities. 


2  Risk-averse  stochastic  maximum  vertex  problem 

Let $G = (V, E)$ be an undirected graph where each vertex $i \in V$ has a positive weight $w_i > 0$. For any subset $S$ of its vertices, let $G[S]$ denote the subgraph of $G$ induced by $S$, i.e., a graph with vertex set $S$ such that any two of its vertices $i$, $j$ are connected by an edge if and only if $(i, j)$ is an edge in $G$.

Property $\Pi$ is said to be hereditary with respect to induced subgraphs (hereditary for short) if for any graph satisfying $\Pi$ the removal of a vertex preserves $\Pi$ in the resulting induced subgraph. Examples of hereditary properties include “complete”, “independent” (or “stable”), “degree constrained”, “planar”, etc. Given a hereditary property $\Pi$, it may be of interest to find a subgraph of $G$ that satisfies $\Pi$ and has the largest additive weight, which is known as the maximum weight subgraph problem, or the maximum weight $\Pi$ problem:

$$
\max_{S \subseteq V} \Big\{ \sum_{i \in S} w_i \;:\; G[S] \text{ satisfies } \Pi \Big\}. \tag{1}
$$

A subgraph of $G$ that satisfies $\Pi$ and whose order cannot be further increased without violating $\Pi$ is known as a maximal $\Pi$-subgraph; the largest such subgraph represents the maximum $\Pi$-subgraph. Obviously, an optimal solution of the maximum weight $\Pi$ problem (1) is necessarily a maximal $\Pi$-subgraph, but may not be its maximum $\Pi$-subgraph.
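
The distinction between a maximum weight $\Pi$-subgraph and a maximum $\Pi$-subgraph can be checked on a toy instance. The sketch below solves problem (1) by brute-force enumeration for the “complete” property (cliques); it is meant for illustration on very small graphs only and is unrelated to the branch-and-bound method developed later. The graph, the weights, and the function names are hypothetical.

```python
from itertools import combinations


def is_clique(subset, edges):
    """The 'complete' property: every pair of vertices in subset is adjacent."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def max_weight_subgraph(vertices, edges, weights, has_property):
    """Enumerate all nonempty S and maximize sum of w_i subject to the property."""
    best_set, best_weight = (), 0.0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if has_property(subset, edges):
                weight = sum(weights[i] for i in subset)
                if weight > best_weight:
                    best_set, best_weight = subset, weight
    return set(best_set), best_weight


# Toy graph: a light triangle {1, 2, 3} and a heavy edge {4, 5}.
toy_vertices = [1, 2, 3, 4, 5]
toy_edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (4, 5)]}
toy_weights = {1: 1.0, 2: 1.0, 3: 1.0, 4: 5.0, 5: 4.0}
best_clique, best_clique_weight = max_weight_subgraph(
    toy_vertices, toy_edges, toy_weights, is_clique)
```

Here the maximum weight clique is the edge $\{4, 5\}$, which is a maximal clique, while the maximum clique (the one of largest order) is the triangle $\{1, 2, 3\}$.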

¹The rationale for the chosen terminology is explained in Remark 1.

Finding subgraphs of maximum weight with hereditary properties represents a large and important class of graph-theoretical problems. A seminal result regarding maximum subgraph problems with hereditary properties was established by Yannakakis [33]. Particularly, property $\Pi$ is called nontrivial if it is satisfied by a single-vertex graph and not satisfied by every graph, and is called interesting if the order of graphs satisfying $\Pi$ is unbounded. Then, the following holds:

Theorem 1 (Yannakakis [33]) If property $\Pi$ is hereditary with respect to induced subgraphs, nontrivial,
and interesting, then the maximum $\Pi$ problem

$$\max \{ |S| \,:\, G[S] \text{ satisfies } \Pi \}$$

is NP-complete.

It is straightforward that the statement of this theorem extends to the version of the maximum weight $\Pi$
problem (1). Some of the most well known instances of (1) include the maximum weight clique problem
(MWCP) and maximum weight independent set problem.

Now we pose the question that served as motivation for the present endeavor: What if the vertex weights
$w_i$ are uncertain? In this case, extending the deterministic formulation (1) into the stochastic domain is
not straightforward and requires additional considerations. Indeed, maximization of the random quantity
that is represented by the sum of random weights in (1) is ill-posed in the context of decision making
under uncertainty that requires a deterministic optimal solution. Therefore, the sum of stochastic weights
in the objective has to be replaced with a statistical functional that utilizes the distributional information
about the weights' uncertainties. The traditional stochastic optimization approach, for example, involves
seeking the best “expected outcome”, which in this setting would translate into maximizing the expected
weight of an induced subgraph $G[S]$. It is easy to see, however, that maximization of the expected
subgraph weight trivially reduces to the deterministic maximum weight $\Pi$ formulation with expected
vertex weights: $\mathsf{E}\big(\sum_{i \in S} w_i\big) = \sum_{i \in S} \mathsf{E}\, w_i$.

In this work, we pursue a risk-averse approach and consider the problem of finding the subgraph of $G$
that satisfies property $\Pi$ and has the lowest risk. Namely, let $X_i$ denote a stochastic variable that is
associated with vertex $i \in V$ and assume that the joint distribution of vector $X_G = (X_1, \ldots, X_{|V|})$ is
known. Assuming that the random quantities $X_i$, $i \in V$, represent costs or losses, consider the problem
of finding the minimum-risk subgraph in $G$ with property $\Pi$, or the risk-averse maximum weighted $\Pi$
problem:


$$\min_{S \subseteq V} \{ \mathcal{R}(S; X_G) \,:\, G[S] \text{ satisfies } \Pi \}. \tag{2}$$

In formulation (2), the functional $\mathcal{R}(S; X_G)$ quantifies the risk of the induced subgraph $G[S]$ given the
distributional information $X_G$, and is undefined as yet.

In order to formally define the risk $\mathcal{R}(S; X_G)$ of a subgraph $G[S]$ in (2), we invoke the concept of risk
measure that is well known in stochastic optimization literature [18]. Namely, given a probability space
$(\Omega, \mathcal{F}, \mathsf{P})$, where $\Omega$ is the set of random events, $\mathcal{F}$ is the $\sigma$-algebra, and $\mathsf{P}$ is the probability measure, a
risk measure is defined as a mapping $\rho : \mathcal{X} \mapsto \mathbb{R}$, where $\mathcal{X}$ is a linear space of $\mathcal{F}$-measurable functions
$X : \Omega \mapsto \mathbb{R}$. This basic definition is typically augmented by additional properties, such as convexity,
monotonicity, etc. (see below) that are dictated by applications.

Then, given a risk measure $\rho$ that we additionally assume to be lower semi-continuous (l.s.c.), the risk
$\mathcal{R}(S; X_G)$ of a subgraph of $G$ induced on a set of vertices $S \subseteq V(G)$ with uncertain vertex weights $X_i$
can be defined as an optimal value of the following stochastic programming problem:

$$\mathcal{R}(S; X_G) = \min \Big\{ \rho\Big( \sum_{i \in S} u_i X_i \Big) \,:\, \sum_{i \in S} u_i = 1,\ u_i \ge 0,\ i \in S \Big\}. \tag{3}$$

Recall that function $f : \mathcal{X} \mapsto \mathbb{R}$ is l.s.c. if and only if the sets $\{X \in \mathcal{X} : f(X) \le a\}$ are closed for all
$a \in \mathbb{R}$. Obviously, lower semi-continuity of risk measure $\rho$ is necessary for the minimization problem in
(3) to be well-posed. In the sequel, it will be implicitly assumed that the risk measure $\rho$ in (3) is l.s.c.

The rationale behind definition (3) of subgraph risk function $\mathcal{R}(\cdot)$ is that, similarly to many “nice” risk
measures, such as those discussed below, it allows for risk reduction through diversification:

Proposition 1 Given a graph $G = (V, E)$ with stochastic weights $X_i$, $i \in V$, and a l.s.c. risk measure
$\rho$, the subgraph risk function $\mathcal{R}$ defined by (3) satisfies

$$\mathcal{R}(S_2; X_G) \le \mathcal{R}(S_1; X_G) \quad \text{for all } S_1 \subseteq S_2. \tag{4}$$


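Proof: For $S_1 \subseteq S_2$, denote

$$u^{(k)} \in \operatorname{argmin} \Big\{ \rho\Big( \sum_{i \in S_k} u_i X_i \Big) \,:\, \sum_{i \in S_k} u_i = 1;\ u_i \ge 0,\ i \in S_k \Big\}, \quad k = 1, 2.$$

Then, one immediately has

$$\mathcal{R}(S_2; X_G) = \rho\Big( \sum_{i \in S_2} u_i^{(2)} X_i \Big) \le \rho\Big( \sum_{i \in S_1} u_i^{(1)} X_i + \sum_{j \in S_2 \setminus S_1} 0 \cdot X_j \Big) = \mathcal{R}(S_1; X_G),$$

where the inequality holds because $u^{(1)}$, padded with zeros on $S_2 \setminus S_1$, is feasible for the minimization
over $S_2$, and the minima exist due to lower semicontinuity of risk measure $\rho$. □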
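Property (4) can be observed numerically on a toy example. The following Python sketch is an illustration only and is not part of the implementation discussed in this report: it takes $\rho$ to be CVaR on a discrete scenario set and replaces the exact minimization in (3) by a search over a grid on the simplex, so the returned values are approximations of $\mathcal{R}(S; X_G)$ from above. The function names, the scenario layout, and the data are hypothetical.

```python
from itertools import product

def cvar(losses, probs, alpha):
    """CVaR of a discrete loss distribution: the minimum over eta of
    eta + E(X - eta)^+ / (1 - alpha). The objective is piecewise linear and
    convex in eta, so it suffices to try eta at the scenario values."""
    return min(
        eta + sum(p * max(x - eta, 0.0) for x, p in zip(losses, probs)) / (1.0 - alpha)
        for eta in losses
    )

def simplex_grid(n, m):
    """Yield weight vectors u >= 0 of length n with sum(u) == 1 and entries k/m."""
    for head in product(range(m + 1), repeat=n - 1):
        if sum(head) <= m:
            yield [k / m for k in head] + [(m - sum(head)) / m]

def subgraph_risk(S, scenarios, probs, alpha, m=20):
    """Grid approximation of R(S; X_G) in (3) with rho = CVaR.
    scenarios[k][i] is the loss of vertex i under scenario k (assumed layout)."""
    S = sorted(S)
    best = float("inf")
    for u in simplex_grid(len(S), m):
        portfolio = [sum(ui * row[i] for ui, i in zip(u, S)) for row in scenarios]
        best = min(best, cvar(portfolio, probs, alpha))
    return best

# Hypothetical data: 3 vertices, 4 equally likely scenarios.
scenarios = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.4], [0.6, 0.3, 0.7], [0.1, 0.9, 0.2]]
probs = [0.25] * 4
r1 = subgraph_risk({0}, scenarios, probs, alpha=0.5)
r2 = subgraph_risk({0, 1}, scenarios, probs, alpha=0.5)
r3 = subgraph_risk({0, 1, 2}, scenarios, probs, alpha=0.5)
print(r1, r2, r3)  # the values do not increase as the vertex set grows
```

Because the grid for a smaller vertex set is contained (after zero padding) in the grid for a larger one, the monotonicity in (4) is preserved by this approximation.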

Note that the power of definition (3) via solution of a stochastic programming problem is evidenced in
the fact that property (4) of risk reduction via diversification holds for any l.s.c. risk measure
$\rho : \mathcal{X} \mapsto \mathbb{R}$. Secondly, property (4) implies the following important observation regarding the optimal
solution of the risk-averse maximum weighted $\Pi$ problem (2):

Corollary 1 There exists an optimal solution of the risk-averse maximum weighted $\Pi$ problem (2) with
$\mathcal{R}(S; X_G)$ defined by (3) that is a maximal $\Pi$-subgraph in $G$.


Remark  1  The  introduced  problem  (2)  of  finding  minimum-risk  subgraphs  with  risk  defined  by  (3) 
is  strongly  related  to  the  class  of  maximum-weight  subgraph  problems  (1),  in  the  sense  that  both  are 
concerned  with  weighted  graphs,  and  their  optimal  solutions  can  be  represented  by  maximal  subgraphs; 
however, in contrast to (1), an optimal solution of (2)-(3) is not a subgraph of maximum “weight”. To
emphasize the similarities and differences with (1), we call the risk-minimization problem (2) a
“risk-averse maximum weighted subgraph problem”.

In this respect, it is worth mentioning that the presented framework differs from other recent studies
that also utilized formally defined risk measures for quantifying the risk in graphs, but relied on explicit
maximization of the subgraph's cardinality or weight while requiring that its risk be bounded (see, e.g.,
[23, 34]):

$$\max \{ |S| \,:\, \mathrm{Risk}(S) \le c_0,\ G[S] \text{ satisfies } \Pi \}.$$

Indeed, the proposed definition (3) of risk function $\mathcal{R}$ in the R-MWS problem (2) implies that
maximization of a solution's cardinality is a consequence of risk minimization via diversification.

Further properties of $\mathcal{R}(S; X_G)$ depend on those of the risk measure $\rho$ in (3). In this work we assume $\rho$
to belong to a family of coherent measures of risk. According to [2], risk measure $\rho$ is called coherent if
it satisfies the following four properties (axioms):

(A1) monotonicity: $\rho(X) \le \rho(Y)$ for all $X, Y \in \mathcal{X}$ such that $X \le Y$;

(A2) subadditivity: $\rho(X + Y) \le \rho(X) + \rho(Y)$ for all $X, Y \in \mathcal{X}$;

(A3) positive homogeneity: $\rho(\lambda X) = \lambda \rho(X)$ for all $X \in \mathcal{X}$ and $\lambda > 0$;

(A4) translation invariance: $\rho(X + a) = \rho(X) + a$ for all $X \in \mathcal{X}$ and $a \in \mathbb{R}$.

An intuitive interpretation of the above axioms is as follows. Axiom (A1) guarantees that lower losses
yield lower risk. The sub-additivity axiom (A2) is important in the context of risk reduction via
diversification. It is also of fundamental significance from the optimization viewpoint, since it yields, together
with the positive homogeneity axiom (A3), the all-important convexity property:

$$\rho(\lambda X + (1 - \lambda) Y) \le \lambda \rho(X) + (1 - \lambda) \rho(Y) \quad \text{for all } X, Y \in \mathcal{X},\ \lambda \in [0, 1].$$

The positive homogeneity property (A3) postulates that losses and risk scale correspondingly. Axiom
(A4) ensures that a constant change in $X$ will translate equivalently in risk $\rho(X)$.
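The four axioms can be spot-checked numerically for a concrete coherent measure. The Python sketch below does so for CVaR on a small discrete distribution; it is a sanity check on hypothetical data, not a proof, and the helper `cvar` is an illustrative name rather than code from this report.

```python
def cvar(losses, probs, alpha):
    """CVaR of a discrete loss distribution via the minimization formula
    eta + E(X - eta)^+ / (1 - alpha), with eta restricted to scenario values
    (sufficient because the objective is piecewise linear and convex)."""
    return min(
        eta + sum(p * max(x - eta, 0.0) for x, p in zip(losses, probs)) / (1.0 - alpha)
        for eta in losses
    )

# Hypothetical losses of two random variables on 4 equally likely scenarios.
probs = [0.25] * 4
alpha = 0.5
X = [0.9, 0.2, 0.6, 0.1]
Y = [0.1, 0.8, 0.3, 0.9]
W = [max(x, y) for x, y in zip(X, Y)]        # W >= X in every scenario

rho_X = cvar(X, probs, alpha)
rho_Y = cvar(Y, probs, alpha)
rho_W = cvar(W, probs, alpha)                                  # (A1): rho_X <= rho_W
rho_sum = cvar([x + y for x, y in zip(X, Y)], probs, alpha)    # (A2): <= rho_X + rho_Y
rho_scaled = cvar([3.0 * x for x in X], probs, alpha)          # (A3): == 3 * rho_X
rho_shifted = cvar([x + 2.0 for x in X], probs, alpha)         # (A4): == rho_X + 2
print(rho_X, rho_Y, rho_W, rho_sum, rho_scaled, rho_shifted)
```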

The next proposition states that when the risk measure $\rho$ in (3) is coherent, or at least possesses properties
(A1), (A3), (A4), then the corresponding subgraph risk function $\mathcal{R}(S; X_G)$ satisfies properties analogous
to (A1), (A3), (A4) with respect to the stochastic weights vector $X_G$.

Proposition 2 Let $G = (V, E)$ be an undirected graph, and $X_G = (X_1, \ldots, X_{|V|})$ and
$Y_G = (Y_1, \ldots, Y_{|V|})$ be vectors of stochastic weights whose components are defined on the same linear space
$\mathcal{X}$. If the risk measure $\rho$ in (3) is l.s.c. and satisfies axioms (A1), (A3), and (A4) of coherency, then for any
induced subgraph $G[S]$ the subgraph risk function $\mathcal{R}$ defined in (3) satisfies the following properties:

(G1) $\mathcal{R}(S; X_G) \le \mathcal{R}(S; Y_G)$ for all $X_G \le Y_G$;

(G2) $\mathcal{R}(S; \lambda X_G) = \lambda \mathcal{R}(S; X_G)$ for all $X_G$ and $\lambda > 0$;

(G3) $\mathcal{R}(S; X_G + a\mathbf{1}) = \mathcal{R}(S; X_G) + a$ for all $a \in \mathbb{R}$;

where $\mathbf{1}$ is the vector of ones, and the vector inequality $X_G \le Y_G$ is interpreted component-wise.

Proof: Consider, for example, property (G1). Denoting, as before,

$$u^{Z} \in \operatorname{argmin} \Big\{ \rho\Big( \sum_{i \in S} u_i Z_i \Big) \,:\, \sum_{i \in S} u_i = 1;\ u_i \ge 0,\ i \in S \Big\},$$

we have

$$\mathcal{R}(S; X_G) = \rho\Big( \sum_{i \in S} u_i^{X} X_i \Big) \le \rho\Big( \sum_{i \in S} u_i^{Y} X_i \Big).$$

On the other hand, from $X_i \le Y_i$ it follows that

$$\sum_{i \in S} u_i^{Y} X_i \le \sum_{i \in S} u_i^{Y} Y_i,$$

whence, by monotonicity (A1),

$$\mathcal{R}(S; X_G) \le \rho\Big( \sum_{i \in S} u_i^{Y} X_i \Big) \le \rho\Big( \sum_{i \in S} u_i^{Y} Y_i \Big) = \mathcal{R}(S; Y_G).$$

Properties (G2) and (G3) are verified similarly. □


Observe that $\mathcal{R}(S; X_G)$ does not obey the sub-additivity with respect to the stochastic weights, i.e., in
general

$$\mathcal{R}(S; X_G + Y_G) \not\le \mathcal{R}(S; X_G) + \mathcal{R}(S; Y_G).$$

With respect to the traditional risk measures $\rho : \mathcal{X} \mapsto \mathbb{R}$, the failure to satisfy the sub-additivity
requirement (or, if positive homogeneity also does not hold, the convexity requirement) implies that such a risk
measure is ill fitting for risk reduction via diversification. In other words, it is possible that diversification
can result in an increased risk exposure, as measured by a non-subadditive (correspondingly, nonconvex)
risk measure $\rho$.

In the context of the proposed risk function $\mathcal{R}$ for subgraphs, risk reduction via diversification is already
ascertained by (4), which, with respect to the problem of finding a $\Pi$-subgraph with the smallest risk,
ensures that adding new vertices to the existing feasible solution that satisfies a hereditary property $\Pi$
is always beneficial, provided that $\Pi$ is not violated by the addition of new vertices. Yet, under an
additional assumption that the stochastic vertex weights have non-negative support, i.e., $X_G \ge 0$, the
subgraph risk function $\mathcal{R}(S; X_G)$ can be shown to be “set-subadditive”. Namely, one has


Proposition 3 Let the stochastic vertex weights $X_i$, $i \in V$, of graph $G = (V, E)$ satisfy $X_i \ge 0$, $i \in V$.
Then, for any $S_1, S_2 \subseteq V$ the subgraph risk function $\mathcal{R}(S; X_G)$ defined by (3) satisfies

$$\mathcal{R}(S_1 \cup S_2; X_G) \le \mathcal{R}(S_1; X_G) + \mathcal{R}(S_2; X_G), \tag{5}$$

provided that the risk measure $\rho$ in (3) is l.s.c. and satisfies (A1) and (A2).


Proof: If $\rho$ satisfies axioms (A1) and (A2), then $\rho(X) \ge 0$ for any $X \ge 0$, and hence $\mathcal{R}(S_2; X_G) \ge 0$.
Together with Proposition 1, one immediately has
$\mathcal{R}(S_1 \cup S_2; X_G) \le \mathcal{R}(S_1; X_G) \le \mathcal{R}(S_1; X_G) + \mathcal{R}(S_2; X_G)$. □

Naturally, in the context of risk-averse maximum weighted $\Pi$ problems where $\Pi$ is hereditary, one
should also require that $S_1$, $S_2$, and $S_1 \cup S_2$ satisfy $\Pi$.

Note that the assumption of nonnegative support for vertex weights $X_i$ is analogous to the standard
assumption of positive vertex weights in hereditary maximum weight subgraph problems such as the
maximum clique and independent set problems [5, 27].

3  Solution  approaches  for  risk-averse  maximum  weighted  subgraph
problems

In this section we consider a mathematical programming formulation for the R-MWS $\Pi$ problem (2),
where the risk $\mathcal{R}(S)$ of induced subgraph $G[S]$ is defined as in (3), and propose a graph-based, or
combinatorial branch-and-bound algorithm that represents an extension of the well-known
branch-and-bound schemes for the maximum clique problem [11, 25, 26].

3.1  A  mathematical  programming  formulation 

Given a graph-theoretic property $\Pi$, let binary decision variables $x_i$ indicate whether node $i \in V$ belongs
to a subset $S$, such that the induced subgraph $G[S]$ satisfies $\Pi$:

$$x_i = \begin{cases} 1, & i \in S \text{ such that } G[S] \text{ satisfies } \Pi, \\ 0, & \text{otherwise.} \end{cases}$$

Further, let $\Pi_G(x) \le 0$ denote the structural constraints such that for any $x \in \{0,1\}^{|V|}$, $\Pi_G(x) \le 0$ if
and only if $G[S]$ satisfies $\Pi$, where $S = \{i \in V : x_i = 1\}$. Then, the following proposition, which we
give without proof, formalizes a mathematical programming representation for the risk-averse maximum
weighted $\Pi$ problem (2) with risk $\mathcal{R}(S; X_G)$ defined by (3) if the property $\Pi$ is hereditary on induced
subgraphs:

Proposition 4 Let $G = (V, E)$ be an undirected graph with stochastic vertex weights $X_i$, $i \in V$, and
$\Pi$ be a property hereditary on induced subgraphs. Then, the R-MWS $\Pi$ problem (2) with risk defined by
(3) can equivalently be represented as a mixed 0-1 programming problem

$$\begin{aligned}
\min\ \ & \rho(u^\top X_G) \\
\text{s. t.}\ \ & u^\top \mathbf{1} = 1 \\
& u \le x \\
& \Pi_G(x) \le 0 \\
& x \in \{0,1\}^{|V|},\ u \in \mathbb{R}_+^{|V|}.
\end{aligned} \tag{6}$$

When the property $\Pi$ in (6) denotes graph completeness, one can choose, for example, the well-known
edge formulation of the maximum clique problem (see, e.g., [27]) to represent the structural constraints
in (6) as

$$\{x \in \{0,1\}^{|V|} : \Pi_G(x) \le 0\} = \{x \in \{0,1\}^{|V|} : x_i + x_j \le 1 \text{ for all } (i, j) \in \bar{E}\},$$

where $\bar{E}$ represents the complement edges of graph $G$, whereby the mathematical programming
formulation of the R-MWS clique problem (2)-(3) takes the form

$$\begin{aligned}
\min\ \ & \rho\Big( \sum_{i \in V} u_i X_i \Big) \\
\text{s. t.}\ \ & \sum_{i \in V} u_i = 1 \\
& u_i \le x_i, \quad i \in V \\
& x_i + x_j \le 1, \quad (i, j) \in \bar{E} \\
& x_i \in \{0,1\},\ u_i \ge 0, \quad i \in V.
\end{aligned} \tag{7}$$
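To make the role of the constraints in (7) concrete, the following Python sketch builds the complement edge set $\bar{E}$ of a small graph and checks whether a given pair $(x, u)$ satisfies the constraints of (7); the objective is not evaluated. The function names and the 4-vertex graph are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def complement_edges(n, edges):
    """Pairs (i, j), i < j, that are NOT edges of G, i.e. the set E-bar in (7)."""
    present = {frozenset(e) for e in edges}
    return [(i, j) for i, j in combinations(range(n), 2)
            if frozenset((i, j)) not in present]

def is_feasible(x, u, edges, tol=1e-9):
    """Check the constraints of formulation (7) for binary x and weights u."""
    n = len(x)
    return (
        all(xi in (0, 1) for xi in x)                       # x binary
        and all(ui >= -tol for ui in u)                     # u >= 0
        and abs(sum(u) - 1.0) <= tol                        # weights sum to one
        and all(ui <= xi + tol for ui, xi in zip(u, x))     # u_i <= x_i
        and all(x[i] + x[j] <= 1                            # no two non-adjacent vertices
                for i, j in complement_edges(n, edges))
    )

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]          # hypothetical graph on 4 vertices
ok_clique = is_feasible([1, 1, 1, 0], [0.5, 0.3, 0.2, 0.0], edges)   # triangle {0, 1, 2}
not_clique = is_feasible([1, 1, 0, 1], [0.5, 0.3, 0.0, 0.2], edges)  # 0 and 3 not adjacent
print(ok_clique, not_clique)
```

The coupling constraint $u_i \le x_i$ is what forces the diversification weights to vanish outside the selected clique.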

Formulations  (6)-(7)  allow  for  handling  risk  measures  p  whose  representations  come  in  the  form  of 
mathematical  programming  problems,  and  can  be  solved  with  appropriate  (nonlinear)  mixed  integer 
programming  solvers. 

A  combinatorial  branch-and-bound  algorithm  that  allows  for  exploiting  the  structure  of  problems  (6)-(7) 
imposed  by  the  underlying  graph  G  is  described  next. 

3.2  A  graph-based  branch-and-bound  algorithm 

The combinatorial branch-and-bound (BnB) algorithm works by navigating between “levels” of the BnB
tree until a subgraph of $G$ that satisfies property $\Pi$ and is guaranteed to be of lowest risk as measured
by (3) is found. The algorithm starts at level $\ell = 0$ with a partial solution $Q := \emptyset$, incumbent solution
$Q^* := \emptyset$, and a global upper bound $L^* := +\infty$ on risk of $Q^*$. Throughout the algorithm, the partial
solution $Q$ contains the vertices in $V$ such that $G[Q]$ has property $\Pi$, and set $Q^*$ induces, per Corollary
1, the lowest-risk maximal $\Pi$-subgraph of $G$ found so far, whose risk equals $L^*$.

Within the current branch of the BnB tree, “level” $\ell$ is associated with the candidate set $C_\ell$ of vertices
such that any single vertex of $C_\ell$ can be added to the current partial solution $Q$ without violating property
$\Pi$. Branching is performed by removing a branching vertex $q$ from $C_\ell$ and adding it to the partial solution
$Q$. The algorithm is initialized with $C_0 := V$, and, as soon as the partial solution $Q$ is updated after
branching at level $\ell$, the corresponding candidate set at level $\ell + 1$ is constructed by removing all vertices
from $C_\ell$ whose inclusion in $Q$ would break the property $\Pi$, i.e.,

$$C_{\ell+1} := \{ i \in C_\ell \,:\, G[\{i\} \cup Q] \text{ satisfies } \Pi \}. \tag{8}$$

As a result, immediately after branching at level $\ell$ the cardinality of partial solution set $Q$ is equal to
$|Q| = \ell + 1$.

The bounding step of the BnB algorithm involves evaluating the quality of the solution that can be
obtained by exploring further the subgraph induced by vertices in $Q \cup C_{\ell+1}$. Observe that an exact
approach of directly finding the $\Pi$-subgraph with the lowest possible risk that is contained in
$G[Q \cup C_{\ell+1}]$ entails solving the following restriction of problem (6):

$$\begin{aligned}
\mathcal{R}(Q \cup C_{\ell+1}; X_G) = \min\ \ & \rho(u^\top X_G) \\
\text{s. t.}\ \ & u^\top \mathbf{1} = 1, \\
& u \le x, \\
& \Pi_G(x) \le 0, \\
& x \in \{0,1\}^{|V|},\ u \in \mathbb{R}_+^{|V|}, \\
& x_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}).
\end{aligned} \tag{9}$$

As (9) is a (nonlinear) mixed 0-1 problem, solving it at every node of the BnB tree is impractical. Instead,
a lower bound on the value of $\mathcal{R}(Q \cup C_{\ell+1}; X_G)$ given by (9) can be computed. However, in contrast
to the traditional mixed integer programming approach of constructing a lower bound by relaxing the
integrality constraints, we formulate a lower bound problem by completely eliminating the 0-1 variables
$x_i$ along with the structural constraints:

$$\begin{aligned}
\mathcal{R}(Q \cup C_{\ell+1}; X_G) \ge \mathcal{L}(Q \cup C_{\ell+1}) := \min\ \ & \rho\Big( \sum_{i \in V} u_i X_i \Big) \\
\text{s. t.}\ \ & \sum_{i \in V} u_i = 1 \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}) \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1}.
\end{aligned} \tag{10}$$

Observe that the structural constraints $\Pi_G(x) \le 0$ in problem (9) are satisfied by variables $\{x_i : i \in Q\}$
(since $G[Q]$ satisfies $\Pi$), as well as by variables $\{x_i : i \in Q \cup j_0\}$ for each $j_0 \in C_{\ell+1}$ (since $G[Q \cup j_0]$
for each vertex $j_0$ in $C_{\ell+1}$ also satisfies $\Pi$, per definition (8) of the candidate set $C_{\ell+1}$). Hence, the
corresponding structural constraints are redundant in (9). On the other hand, the structural constraints
are not necessarily satisfied by variables $\{x_i : i \in C_{\ell+1}\}$ and $\{x_i : i \in Q \cup C_{\ell+1}\}$, since $G[C_{\ell+1}]$
and $G[Q \cup C_{\ell+1}]$ do not necessarily satisfy $\Pi$. Thus, (10) is a relaxation of (9), and, by virtue of
Proposition 1, the solution to (10) provides a lower bound on the minimum risk achievable in any
$\Pi$-subgraph induced on the union of $Q$ with any subset of $C_{\ell+1}$, i.e.,

$$\mathcal{L}(Q \cup C_{\ell+1}) \le \mathcal{R}(Q \cup C_{\ell+1}; X_G) \le \mathcal{R}(Q \cup S; X_G) \quad \text{for any } S \subseteq C_{\ell+1}.$$

Observe that if $\ell' = \ell + 1$ represents the next level in the BnB tree, and $Q'$ is the corresponding partial
solution, then due to the definition (8) of candidate set one has

$$(Q' \cup C_{\ell'+1}) \subseteq (Q \cup C_{\ell+1}),$$

whence the risk $\mathcal{R}(Q \cup C_{\ell+1}; X_G)$ does not decrease as $\ell$ increases (or, in other words, as new vertices
are added to the partial solution $Q$ and the algorithm proceeds to deeper levels $\ell$ of the BnB tree). We
next show that this observation is an effective bounding criterion to obtain a $\Pi$-subgraph of lowest risk
in $G$.

Depending on the computed value of $\mathcal{L}(Q \cup C_{\ell+1})$, the algorithm branches further or prunes/backtracks
as follows. If $\mathcal{L}(Q \cup C_{\ell+1}) \ge L^*$, then the vertex $q$ is removed from $Q$ and the corresponding branch of
the BnB tree is fathomed due to the fact that there exists no possibility of achieving a reduction in risk by
sequential branching/refinement. Further, if $C_\ell \ne \emptyset$, another branching vertex is selected and removed
from $C_\ell$ and added to $Q$. Otherwise, if $C_\ell = \emptyset$, the algorithm backtracks to level $\ell - 1$.

In the case of $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ and $C_{\ell+1} \ne \emptyset$, the algorithm proceeds to select a branching vertex $q$
at the next level $\ell + 1$. If $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ and $C_{\ell+1} = \emptyset$, the subgraph induced by the partial solution
$Q$ represents a maximal $\Pi$-subgraph in $G$ and is declared as the new incumbent solution, $Q^* := Q$, the
global upper bound on risk is updated, $L^* := \mathcal{L}(Q \cup C_{\ell+1})$, and the algorithm backtracks to level $\ell - 1$.

With regard to the branching rule, the observed computational performance suggests that branching on
a vertex $q$ with the smallest value of $\rho(X_q)$ or $\mathsf{E} X_q$ is most effective. To this end, vertices in the set
$C_0 = V$ are pre-sorted during the initialization phase of the algorithm in descending order with respect
to their risks $\rho(X_i)$ or expected values $\mathsf{E} X_i$, and then the last vertex in $C_\ell$ is selected for branching.

The  outlined  branch-and-bound  procedure  for  R-MWS  problems  is  formalized  as  Algorithm  1. 


Algorithm 1: Graph-based branch-and-bound method for R-MWSP

Initialize: $\ell := 0$; $C_0 := V$; $Q := \emptyset$; $Q^* := \emptyset$; $L^* := \infty$;
while (not STOP) do
    if $C_\ell \ne \emptyset$ then
        select a vertex $q \in C_\ell$;
        $C_\ell := C_\ell \setminus q$;
        $Q := Q \cup q$;
        $C_{\ell+1} := \{ i \in C_\ell : i \cup Q \text{ satisfies } \Pi \}$;
        solve $\mathcal{L}(Q \cup C_{\ell+1})$;
        if $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ then
            if $C_{\ell+1} \ne \emptyset$ then
                $\ell := \ell + 1$;
            else
                $Q^* := Q$;
                $L^* := \mathcal{L}(Q \cup C_{\ell+1})$;
                $Q := Q \setminus q$;
        else
            $Q := Q \setminus q$;
            if $\ell = 0$ then
                STOP
    else
        $\ell := \ell - 1$;
        if $\ell = -1$ then
            STOP
        $Q := Q \setminus q$;
return $Q^*$;
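The branching, candidate-set update (8), bounding, and incumbent update described above can be sketched compactly for the clique property. The Python code below is a recursive variant written for illustration; it is not the iterative listing of Algorithm 1 and not the C++ implementation used in the experiments. The lower bound $\mathcal{L}(\cdot)$ of (10) is passed in as a callable, and, to keep the example free of external solvers, the demonstration assumes $\rho = \mathsf{E}$ (expectation), for which (10) is linear in $u$ and its optimal value is simply the smallest expected loss in the vertex set. All names and data are hypothetical.

```python
def min_risk_clique(adj, lower_bound, order):
    """Recursive sketch of the graph-based BnB for the clique property.

    adj         -- dict: vertex -> set of neighbours
    lower_bound -- callable returning L(S), the optimal value of (10) on vertex set S
    order       -- branching order of the vertices
    """
    best = {"Q": frozenset(), "L": float("inf")}

    def expand(Q, C):
        for pos, q in enumerate(C):
            Q_new = Q | {q}
            # Candidate update (8) for cliques: keep the remaining candidates adjacent to q.
            C_new = [i for i in C[pos + 1:] if i in adj[q]]
            bound = lower_bound(Q_new | set(C_new))
            if bound >= best["L"]:
                continue                                  # fathom this branch
            if C_new:
                expand(Q_new, C_new)                      # proceed to the next level
            else:
                best["Q"], best["L"] = Q_new, bound       # new incumbent

    expand(frozenset(), list(order))
    return best["Q"], best["L"]

# Hypothetical 5-vertex graph and expected vertex losses.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
adj = {v: set() for v in range(5)}
for i, j in edges:
    adj[i].add(j)
    adj[j].add(i)
mean_loss = {0: 0.5, 1: 0.4, 2: 0.6, 3: 0.2, 4: 0.3}

# Assumption: rho = expectation, so the minimum in (10) is the smallest mean in S.
expected_bound = lambda S: min(mean_loss[i] for i in S)
order = sorted(adj, key=mean_loss.get)        # branch on the least risky vertex first
Q_star, L_star = min_risk_clique(adj, expected_bound, order)
print(sorted(Q_star), L_star)                 # -> [3, 4] 0.2
```

For a genuinely risk-averse measure such as HMCR, `lower_bound` would instead solve the convex problem (10) with an LP or conic solver; the control flow stays the same.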


Depending on the particular form of risk measure $\rho$, evaluation of the lower bound by solving the relaxed
problem (10) can be relatively expensive and be a major contributor to the overall computational cost of
the proposed algorithm. Then, certain efficiencies in computing the lower bound value via (10) can be
implemented by taking into account the properties of the subgraph risk function $\mathcal{R}$. Specifically, if at any
point $(Q \cup C_{\ell+1}) \subseteq (Q' \cup C')$, where $Q'$ and $C'$ are a partial solution and a candidate set for which the
lower bound value $\mathcal{L}(Q' \cup C')$ is known to exceed the current global upper bound, $\mathcal{L}(Q' \cup C') \ge L^*$,
then $\mathcal{L}(Q \cup C_{\ell+1}) \ge \mathcal{L}(Q' \cup C') \ge L^*$ due to Proposition 1. The vertex $q$ under consideration is
then removed from $Q$ and the corresponding subproblem is fathomed. In practice, however, retaining
the list of sets $(Q' \cup C')$ with $\mathcal{L}(Q' \cup C') \ge L^*$ and checking whether the current $Q \cup C_{\ell+1}$ is a
subset of some $Q' \cup C'$ has proven computationally expensive for even moderately sized problems, and
is most notably exacerbated in graph topologies that contain a large number of maximal $\Pi$-subgraphs
(for example, when the graph density increases in the context of risk averse maximum weighted clique
problem). Therefore, a more modest approach is considered where only the vertices from incumbent
solutions $Q^*$ are retained and tested against unfathomed sets $(Q \cup C_{\ell+1})$.


4  Case  study:  Risk-averse  stochastic  maximum  weighted  clique  problem 
with  higher  moment  coherent  risk  measures 


In this section we discuss the computational framework and conduct numerical experiments
demonstrating the computational performance of the proposed BnB algorithm when solving the risk-averse
maximum weighted clique problem (7). We use a family of higher-moment coherent risk (HMCR) measures
that were introduced in [19] as optimal values to the stochastic programming problem of the form

$$\mathrm{HMCR}_{\alpha,p}(X) = \min_{\eta \in \mathbb{R}}\ \eta + (1 - \alpha)^{-1} \| (X - \eta)^+ \|_p, \quad \alpha \in (0, 1),\ p \ge 1, \tag{11}$$

where $X^+ = \max\{0, X\}$ and $\|X\|_p = (\mathsf{E}|X|^p)^{1/p}$. The HMCR measures are nonlinear measures of
risk that quantify the risk of loss distribution $X$ via its tail moments, and are particularly suitable for
measuring risk in heavy-tailed data. HMCR measures possess a number of important properties, such as
coherence, isotonicity with respect to the second-order stochastic dominance, which implies consistency
with the expected utility theory, and so on. A popular case of (11), also known as the Conditional
Value-at-Risk (CVaR) or Expected Shortfall risk measure, arises when $p = 1$:

$$\mathrm{CVaR}_\alpha(X) = \min_{\eta \in \mathbb{R}}\ \eta + (1 - \alpha)^{-1} \mathsf{E}(X - \eta)^+, \quad \alpha \in (0, 1). \tag{12}$$
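For a discrete loss distribution, definitions (11) and (12) can be evaluated directly, since the objective is a convex function of the scalar $\eta$. The Python sketch below does this with a bracketing step followed by ternary search; it is an illustration under the assumption of finitely many scenarios, and the function name `hmcr` and the data are hypothetical. Note that for $p > 1$ the minimizing $\eta$ may lie below the smallest loss, which is why the bracket is extended to the left.

```python
def hmcr(losses, probs, alpha, p):
    """HMCR_{alpha,p} of a discrete loss distribution, following (11):
    minimize eta + ||(X - eta)^+||_p / (1 - alpha) over real eta."""
    def objective(eta):
        tail = sum(pk * max(x - eta, 0.0) ** p for x, pk in zip(losses, probs)) ** (1.0 / p)
        return eta + tail / (1.0 - alpha)

    # For eta >= max(losses) the objective equals eta, so a minimizer lies to the left.
    hi = max(losses)
    lo = min(losses)
    step = max(hi - lo, 1.0)
    # Extend the bracket leftwards until the convex objective stops decreasing.
    while objective(lo - step) < objective(lo):
        lo -= step
        step *= 2.0
    lo -= step
    # Ternary search on the convex objective over [lo, hi].
    for _ in range(200):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) <= objective(b):
            hi = b
        else:
            lo = a
    return objective(0.5 * (lo + hi))

# Hypothetical data: four equally likely loss scenarios.
losses = [1.0, 2.0, 3.0, 4.0]
probs = [0.25] * 4
cvar_value = hmcr(losses, probs, alpha=0.5, p=1)    # mean of the worst half: 3.5
hmcr2_value = hmcr(losses, probs, alpha=0.5, p=2)   # at least as large as CVaR
print(cvar_value, hmcr2_value)
```

Since $\|\cdot\|_p$ is nondecreasing in $p$ on a probability space, the HMCR value with $p = 2$ is never below the CVaR value at the same $\alpha$, and neither exceeds the largest loss.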

Mathematical programming models containing HMCR measures in the objective or constraints can be
formulated using $p$-order cone constraints. As is traditional in stochastic programming, the set of random
events $\Omega$ is considered to be discrete, $\Omega = \{\omega_1, \ldots, \omega_N\}$, with the corresponding probabilities
$\mathsf{P}(\omega_k) = \pi_k > 0$, such that $\pi_1 + \cdots + \pi_N = 1$. Then, the mathematical programming formulation (7) with risk
measure $\rho(X)$ selected as $\mathrm{HMCR}_{\alpha,p}(X)$ takes the form of a mixed integer $p$-order cone programming
(MIpOCP) problem:

$$\begin{aligned}
\min\ \ & \eta + (1 - \alpha)^{-1} t \\
\text{s. t.}\ \ & t \ge \|(y_1, \ldots, y_N)\|_p \\
& \pi_k^{-1/p} y_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \ldots, N \\
& \sum_{i \in V} u_i = 1 \\
& u_i \le x_i, \quad i \in V \\
& x_i + x_j \le 1, \quad (i, j) \in \bar{E} \\
& x_i \in \{0, 1\},\ u_i \ge 0,\ i \in V; \quad y_k \ge 0,\ k = 1, \ldots, N,
\end{aligned} \tag{13}$$

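where $X_{ik}$ is the realization of the stochastic weight of vertex $i \in V$ under scenario $k$, $k = 1, \ldots, N$, and
$\pi_k = \mathsf{P}(X_1 = X_{1k}, \ldots, X_{|V|} = X_{|V|k})$ are the scenario probabilities. Similarly, the lower bound problem
(10) for the combinatorial branch-and-bound algorithm described in the previous section takes the form

$$\begin{aligned}
\mathcal{L}(Q \cup C_{\ell+1}) = \min\ \ & \eta + (1 - \alpha)^{-1} t \\
\text{s. t.}\ \ & t \ge \|(y_1, \ldots, y_N)\|_p \\
& \pi_k^{-1/p} y_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \ldots, N \\
& \sum_{i \in V} u_i = 1 \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1} \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}) \\
& y_k \ge 0, \quad k = 1, \ldots, N.
\end{aligned} \tag{14}$$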
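The substitution behind the conic form can be checked numerically: for fixed $u$ and $\eta$, the smallest feasible values $y_k = \pi_k^{1/p} \big(\sum_i u_i X_{ik} - \eta\big)^+$ and $t = \|y\|_p$ reproduce the HMCR objective of (11). The Python sketch below compares the two expressions on hypothetical data; the function names are illustrative, and the scaling $\pi_k^{-1/p}$ in the scenario constraints is the reading of (14) assumed here.

```python
def direct_objective(u, eta, scenarios, probs, alpha, p):
    """Objective of (11) for the loss sum_i u_i X_i at a fixed eta."""
    losses = [sum(ui * x for ui, x in zip(u, row)) for row in scenarios]
    tail = sum(pk * max(l - eta, 0.0) ** p for l, pk in zip(losses, probs)) ** (1.0 / p)
    return eta + tail / (1.0 - alpha)

def conic_objective(u, eta, scenarios, probs, alpha, p):
    """Objective of (14) at fixed (u, eta) with the smallest feasible y and t."""
    # Tightest y_k satisfying pi_k^(-1/p) * y_k >= sum_i u_i X_ik - eta and y_k >= 0.
    y = [pk ** (1.0 / p) * max(sum(ui * x for ui, x in zip(u, row)) - eta, 0.0)
         for row, pk in zip(scenarios, probs)]
    # Tightest t satisfying the p-order cone constraint t >= ||y||_p.
    t = sum(yk ** p for yk in y) ** (1.0 / p)
    return eta + t / (1.0 - alpha)

# Hypothetical data: 3 vertices, 4 scenarios with unequal probabilities.
scenarios = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.4], [0.6, 0.3, 0.7], [0.1, 0.9, 0.2]]
probs = [0.1, 0.2, 0.3, 0.4]
u = [0.5, 0.3, 0.2]
direct = direct_objective(u, 0.4, scenarios, probs, alpha=0.9, p=3)
conic = conic_objective(u, 0.4, scenarios, probs, alpha=0.9, p=3)
print(direct, conic)   # the two values agree up to rounding
```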

In cases when $p = 1$ or $2$, problems (13) and (14) reduce to linear programming (LP) and second order
cone programming (SOCP) models, respectively. Both represent well established subjects in
optimization, for which a range of efficient solvers exist. However, no efficient long-step self-dual interior point
methods exist for solving $p$-order conic constrained problems when $p \in (1, 2) \cup (2, \infty)$ due to the fact
that the $p$-cone is not self-dual in this case. Below we discuss solution methods based on polyhedral
approximations of $p$-order cones and representation of rational-order $p$-cones via second order cones.

Both these approaches rely on “lifting” a $p$-order cone into a higher dimensional space by representing
it as an intersection of a (large) number of three-dimensional (3D) cones.

In order to construct a polyhedral approximation of $p$-cone $t \ge \|(y_1, \ldots, y_N)\|_p$, it first can be
equivalently represented as a chain of 3D $p$-cone inequalities of the form [7, 32]:

$$t = y_{2N-1}, \quad y_{N+j} \ge \|(y_{2j-1}, y_{2j})\|_p, \quad j = 1, \ldots, N - 1. \tag{15}$$


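Then, each 3D $p$-cone in (15) is replaced with its (outer) gradient polyhedral approximation in the form
of $m + 1$ circumscribed planes:

$$y_{N+j} \ge y_{2j-1} \frac{\cos^{p-1}\theta_v}{(\cos^{p}\theta_v + \sin^{p}\theta_v)^{(p-1)/p}} + y_{2j} \frac{\sin^{p-1}\theta_v}{(\cos^{p}\theta_v + \sin^{p}\theta_v)^{(p-1)/p}}, \quad \theta_v = \frac{\pi v}{2m}, \quad v = 0, \ldots, m. \tag{16}$$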
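The outer (circumscribed) nature of the gradient approximation can be illustrated numerically. The Python sketch below computes plane coefficients of the form assumed in (16), namely the gradient of $\|(\cdot,\cdot)\|_p$ at the points $(\cos\theta_v, \sin\theta_v)$, and compares the resulting piecewise-linear value with the exact $p$-norm on a few nonnegative points. The function names and test points are hypothetical, and the exponent $(p-1)/p$ in the denominator is the reading of (16) adopted in this sketch.

```python
import math

def gradient_planes(p, m):
    """Coefficients (a_v, b_v) of the m + 1 planes y3 >= a_v * y1 + b_v * y2,
    obtained as gradients of the p-norm at angles theta_v = pi * v / (2 m)."""
    planes = []
    for v in range(m + 1):
        theta = math.pi * v / (2 * m)
        c, s = math.cos(theta), math.sin(theta)
        denom = (c ** p + s ** p) ** ((p - 1.0) / p)
        planes.append((c ** (p - 1) / denom, s ** (p - 1) / denom))
    return planes

def outer_value(y1, y2, planes):
    """Smallest y3 allowed by the polyhedral approximation at (y1, y2) >= 0."""
    return max(a * y1 + b * y2 for a, b in planes)

def p_norm(y1, y2, p):
    """Exact value ||(y1, y2)||_p for nonnegative arguments."""
    return (y1 ** p + y2 ** p) ** (1.0 / p)

p, m = 3, 8
planes = gradient_planes(p, m)
points = [(1.0, 2.0), (3.0, 0.5), (1.0, 1.0), (0.0, 2.0)]
ratios = [outer_value(y1, y2, planes) / p_norm(y1, y2, p) for y1, y2 in points]
print(ratios)   # each ratio is at most 1 and approaches 1 as m grows
```

By Hölder's inequality each plane lies below the $p$-norm, so the polyhedral set contains the cone, which is the property behind the bound in (18).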




The resulting approximating LP problem can be solved by an efficient cutting plane algorithm that admits
generation of cutting planes in a constant time that does not depend on the accuracy of approximation
[20, 32].

Alternatively, an exact solution of a $p$-order cone programming problem can be obtained by means of
reformulating it as a SOCP problem in the case when the parameter $p$ is a rational number. For example,
in the case of $p = 3$, the $p$-order cone $t \ge \|(y_1, \ldots, y_N)\|_3$ can be represented via $2N$ rotated 3D
quadratic cones [24]:

$$t = z_1 + \ldots + z_N, \quad y_j^2 \le t v_j, \quad v_j^2 \le z_j y_j, \quad j = 1, \ldots, N. \tag{17}$$
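A short derivation shows why a system of this type describes the third-order cone. It assumes that the two families of constraints in (17) are the rotated quadratic cones $y_j^2 \le t v_j$ and $v_j^2 \le z_j y_j$ with all variables nonnegative; this reading is an assumption of the sketch.

```latex
% Assumed form of (17): y_j^2 <= t v_j and v_j^2 <= z_j y_j, with t, y_j, v_j, z_j >= 0.
\begin{align*}
y_j^4 \;\le\; t^2 v_j^2 \;\le\; t^2 z_j y_j
  \quad &\Longrightarrow \quad y_j^3 \le t^2 z_j, \qquad j = 1, \ldots, N, \\
\sum_{j=1}^{N} y_j^3 \;\le\; t^2 \sum_{j=1}^{N} z_j \;=\; t^3
  \quad &\Longrightarrow \quad \|(y_1, \ldots, y_N)\|_3 \le t .
\end{align*}
```

Conversely, for any $t \ge \|y\|_3$ with $t > 0$ one may take $v_j = y_j^2 / t$ and $z_j = y_j^3 / t^2$, increasing some $z_j$ if needed so that the $z_j$ sum to $t$; both families of inequalities then hold.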

Both  the  described  polyhedral  approximation  approach  and  SOCP  reformulation  approach  have  been 
employed  in  our  implementation  of  the  combinatorial  branch-and-bound  algorithm  of  Section  3.2  in  the 
cases  when  the  lower  bound  problem  (14)  is  nonlinear,  i.e.,  when  p  >  1. 

Specifically, a polyhedral approximation of the lower bound problem (14) was solved at each node of the
BnB tree instead of the exact nonlinear problem (14) itself. This allows for a significant reduction in
the computational cost of the BnB method, since the warm-start capabilities of LP simplex solvers can
be utilized during repeated solving of the approximating LP problem.

The exact solution method that is based on the SOCP reformulation is employed for solving (14) once
an incumbent solution is found, and the corresponding optimal value is used to update the global upper
bound $L^*$. Due to the fact that the described polyhedral approximation is an outer approximation, one
has

$$\mathcal{L}_{\mathrm{LP}}(Q \cup C_{\ell+1}) \le \mathcal{L}(Q \cup C_{\ell+1}), \tag{18}$$

where $\mathcal{L}_{\mathrm{LP}}(Q \cup C_{\ell+1})$ is the optimal value given by the polyhedral (LP) approximation of the lower
bound problem. This implies that for any $Q \cup C_{\ell+1}$ containing an incumbent solution $Q^*$, the following
holds

$$\mathcal{L}_{\mathrm{LP}}(Q \cup C_{\ell+1}) \le \mathcal{L}_{\mathrm{LP}}(Q^*) \le \mathcal{L}(Q^*) = L^*,$$

which guarantees the correctness of the BnB algorithm relying on polyhedral approximations. Note,
however, that inequality (18) also implies that the use of polyhedral approximations instead of the exact
nonlinear formulation of the lower bound problem (14) may result in delayed pruning of “non-promising”
branches of the BnB tree in situations when

$$\mathcal{L}_{\mathrm{LP}}(Q \cup C_{\ell+1}) < L^* \le \mathcal{L}(Q \cup C_{\ell+1}).$$

Still,  in  our  experience,  the  computational  savings  due  to  the  use  of  polyhedral  approximations  during 
the  BnB  procedure  greatly  outweigh  the  costs  of  possible  delayed  pruning. 

Note also that in the special case of $p = 1$, when $\rho(X) = \mathrm{CVaR}_\alpha(X)$, the lower bound problem (14) is
an LP problem and thus requires no polyhedral approximation or SOCP reformulation.

4.1  Setup  of  the  numerical  experiments  and  results 

The numerical studies of the risk-averse maximum weighted clique problem were conducted on
randomly generated Erdős-Rényi graphs [13] of orders $|V| = 50, 100, 150, 200$ and average densities
$d = 0.2, 0.5$, and $0.8$. The stochastic weights of graphs' vertices were generated as i.i.d. samples from the
uniform $U[0, 1]$ distribution. In particular, we generated scenario sets with $N = 50, 100, 200, 500, 1000$
scenarios for each combination of graph order and density. The risk measure $\rho$ has been selected as an
HMCR measure (11) with $p = 1, 2, 3$ and $\alpha = 0.9$.

The combinatorial branch-and-bound algorithm of Section 3.1 with the additional specializations
described above has been coded in C++, and we used the CPLEX Simplex and Barrier solvers for solving
the polyhedral approximations and SOCP reformulations of the p-order cone programming lower bound
problem (14), respectively. In the case of $p = 1$, the CPLEX Simplex solver was used to solve the lower
bound problem directly.

The performance of the developed BnB method was compared with that of the mathematical programming
formulation (13) of the risk-averse maximum weighted clique problem. The MIpOCP problem
(13) was solved with the CPLEX MIP solver in the case of $p = 1$, and the CPLEX MIP Barrier solver was
applied to the SOCP version of (13) in the case of $p = 2$ and to the SOCP reformulation of (13) in the
case of $p = 3$.

The computations were run on an Intel Xeon 3.30GHz PC with 128GB RAM, using version 12.5 of the
CPLEX solver in a Windows 7 64-bit environment.

Table 1 summarizes the computational times, averaged over five instances, corresponding to the
aforementioned problem configurations with a fixed number of scenarios $N = 100$. Observe that the BnB
algorithm provides a one to two orders of magnitude advantage in running time over the CPLEX MIP
solver for all configurations except that of $p = 1$ and $d = 0.8$. For the next set of experiments,
Table 2 demonstrates the effect of variations in the number of scenarios $N$ for different graph orders and
values of $p$ while maintaining a constant average graph density of $d = 0.5$. This edge probability
was chosen because the size of the mathematical programming formulation (13) is density
dependent: the number of structural constraints $x_i + x_j \le 1$, $(i,j) \in \overline{E}$, in (13) increases as
$d$ decreases. The opposite relationship holds for the BnB algorithm, as the search space expands
with the number of edges. Thus, a "fair" comparison between the two solution methods can be made on
graphs with density $d = 0.5$.
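The density dependence can be illustrated with a short calculation. The sketch below assumes that the structural constraints of (13) are written one per non-adjacent vertex pair (the usual edge formulation of the clique problem) and that the graph is an Erdős-Rényi graph with edge probability $d$; the function name is hypothetical.

```python
def expected_clique_constraints(num_vertices, density):
    """Expected number of constraints x_i + x_j <= 1.

    Assumption: one constraint per non-adjacent vertex pair, in a random
    graph where each of the n(n-1)/2 pairs is an edge with probability `density`.
    """
    total_pairs = num_vertices * (num_vertices - 1) // 2
    return (1.0 - density) * total_pairs


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"|V| = 100, d = {d}: about {expected_clique_constraints(100, d):.0f} constraints")
```

Under these assumptions the formulation for a sparse graph carries roughly four times as many structural constraints as the one for a dense graph of the same order, which is in line with the choice of $d = 0.5$ as a middle ground.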

It follows from Tables 1 and 2 that the computational advantages of the combinatorial BnB algorithm
over the direct solution approach become more pronounced (up to two orders of magnitude) as $p$
increases, i.e., as the full formulation (13) and the lower bound problem (14) become more difficult. Also of
interest is the fact that the BnB method often yields better solution times for problems with $p = 3$ than
with $p = 2$. This is a consequence of a known property of the employed cutting-plane algorithm for solving
polyhedral approximations of p-order cone programming problems, which becomes more effective as $p$
increases [20].


                  d = 0.2             d = 0.5             d = 0.8
 p   |V|       BnB     CPLEX       BnB     CPLEX       BnB     CPLEX
 1    50      0.08      1.10      0.37      1.31      3.04      1.90
     100      0.24      6.43      4.02     28.06    206.46    121.27
     150      0.74     38.37     26.86    220.17   4065.16   2434.66
     200      1.67    118.13     73.73   1074.93        --        --
 2    50      0.40     18.54      1.66     45.67     14.50    156.26
     100      1.38    110.67     19.37    412.90    956.93   2555.77
     150      3.37    629.38    124.99   2293.96   6154.76        --
     200      3.68   2822.38    166.44        --        --        --
 3    50      1.35     54.58      2.38     91.98     14.15    273.10
     100      2.43    215.97     17.66    625.52    716.22   4644.90
     150      4.41    927.03    102.28   3560.27        --        --
     200      7.24   3031.77    412.74        --        --        --

Table 1: Average computation times (in seconds) obtained by solving problem (13) using the proposed BnB algorithm and
CPLEX with risk measure (11) and $N = 100$ scenarios. All running times are averaged over 5 instances, and the symbol "--"
indicates that the time limit of 7200 seconds was exceeded.


                 |V| = 50            |V| = 100           |V| = 150           |V| = 200
 p     N       BnB     CPLEX       BnB     CPLEX       BnB     CPLEX       BnB     CPLEX
 1    50      0.19      1.15      1.40     11.88      4.43     43.49     13.09    130.45
     100      0.37      1.31      4.02     28.06     26.86    220.17     73.73   1074.93
     200      0.87      3.01     14.64     71.93     84.83    443.74    329.76   2550.12
     500      4.70     10.40     72.90    219.40    429.80   1794.60   2118.60        --
    1000     14.87     28.82    259.48    702.97   1909.48   6094.66        --        --
 2    50      0.80     22.96      4.10    167.30     12.89    961.32     37.67   3668.54
     100      1.66     45.67     19.37    412.90    124.99   2293.96    166.44        --
     200      6.57    109.72    131.44    907.95    797.04   5961.69    900.50        --
     500     61.10    552.80    970.10        --   3221.70        --        --        --
    1000    194.59    965.69   3669.37        --        --        --        --        --
 3    50      1.22     34.85      3.96    245.79     11.99   1040.01     34.30   3847.40
     100      2.38     91.98     17.66    625.52    102.28   3560.27    412.74        --
     200      5.21    261.83     60.59   2388.44    333.61        --   1424.27        --
     500     20.10   1299.60    248.70        --   1751.90        --        --        --
    1000     58.00   3277.93    768.53        --   5634.04        --        --        --

Table 2: Average computation times (in seconds) obtained by solving problem (13) using the proposed BnB algorithm and
CPLEX with risk measure (11) and edge density $d = 0.5$. All running times are averaged over 5 instances, and the symbol "--"
indicates that the time limit of 7200 seconds was exceeded.


5 Conclusions

In this study, we have considered a class of R-MWS problems, which entail finding a network subgraph
of minimum risk satisfying some hereditary structural property. We employ the HMCR measures as a
rigorous framework for quantifying the distributional information of the stochastic vertex weights. By
means of the diversification properties of the introduced optimization-based risk function for measuring the risk
of subgraphs, it was shown that the inclusion of additional vertices in a partial solution promotes the
minimization of risk; hence, optimal solutions to R-MWS problems are maximal subgraphs. A combinatorial
branch-and-bound algorithm utilizing the risk- and graph-related aspects of the problem structure was
developed and tested on a special case of the risk-averse maximal clique problem. Numerical experiments
on randomly generated Erdős-Rényi graphs demonstrate that the proposed algorithm may significantly
reduce solution times relative to an equivalent mathematical programming counterpart. Notably,
improvements were observed for all the tested graph configurations when using the HMCR measures with
$p = 2, 3$, and for graphs with edge probabilities of less than 0.8 when using an HMCR measure with
$p = 1$.


Abstract We discuss two families of valid inequalities for linear mixed integer programming
problems with cone constraints of arbitrary order, which arise in the context of stochastic
optimization with downside risk measures. In particular, we extend the results of Atamtürk
and Narayanan (Math. Program., 2010, 2011), who developed mixed integer rounding cuts
and lifted cuts for mixed integer programming problems with second order cone constraints.
Numerical experiments conducted on randomly generated problems and portfolio optimization
problems with historical data demonstrate the effectiveness of the proposed methods.

Keywords valid inequalities · nonlinear cuts · mixed integer p-order cone programming ·
stochastic optimization · risk measures


1 Introduction

In this work we consider mixed integer programming problems with linear objective and
p-order cone constraints, which represent an extension of mixed integer second order cone
programming (MISOCP) problems and subsequently are referred to as mixed integer p-order
cone programming (MIpOCP) problems. Specifically, we focus on a class of MIpOCP
instances that arise in stochastic optimization problems with risk-based objective functions
or constraints.

There exists a substantial literature on solution approaches for mixed integer conic programming
problems. In many cases, the proposed methods attempt to extend some of the
techniques developed for mixed integer linear programming. One such research direction
concerns the construction of branch-and-bound schemes based on outer polyhedral approximations
of cones. This potentially allows for computational savings in traversing the
branch-and-bound tree due to the "warm start" capabilities of linear programming solvers. In
particular, Vielma et al. [1] proposed a branch-and-bound method for MISOCP that employed
lifted polyhedral approximations of second order cones due to Ben-Tal and Nemirovski
[2]. Vinel and Krokhmal [3] discuss further development of this approach in the case of
MIpOCP. Drewes [4] presented subgradient-based linear outer approximations for the second
order cone constraints in mixed integer programs. With respect to mixed integer nonlinear
programming, a similar idea has been exploited by Bonami et al. [5] and Tawarmalani
and Sahinidis [6].

Two approaches to the generation of valid inequalities for MISOCP problems have been
proposed by Atamtürk and Narayanan [7,8]. In the first paper the authors introduced a
reformulation of a second order cone constraint using a set of two-dimensional second order
cones and then derived valid inequalities for the resulting mixed integer sets. The authors
termed the obtained cuts conic mixed integer rounding cuts. In [8], a general lifting
procedure for deriving nonlinear conic valid inequalities was proposed and applied to 0-1
MISOCP problems.

In a recent work of Belotti et al. [9], disjunctive conic cuts for MISOCP problems are
introduced. For the case of general convex sets, the authors are able to describe the convex
hull of the intersection of a convex set and a linear disjunction. In the particular case of
the feasible set of the continuous relaxation of a MISOCP problem, they derive a closed-form
expression for such a convex hull, thus obtaining a new nonlinear conic cut.

Among other approaches to solving mixed integer cone programming problems one
can mention the split closure of a strictly convex body [10], a lift-and-project algorithm [11],
and Chvátal-Gomory and disjunctive cuts for 0-1 conic programming [12].

It is worth noting that the vast majority of the existing literature on mixed integer cone
programming problems addresses the case of self-dual cones, and particularly second order
cones, with relatively little attention paid to problems involving cones that are not self-dual,
as in the case of MIpOCP with $p \in\, ]1,2[\, \cup\, ]2,\infty[$. In this work, we consider the derivation of
valid inequalities for mixed integer problems with p-order cone constraints following the
techniques [7, 8] proposed for MISOCP. We derive closed form expressions for two families
of valid inequalities for MIpOCP problems: mixed integer rounding conic cuts and lifted
conic cuts. We also propose to use outer polyhedral approximations as a practical way of
employing nonlinear lifted cuts within a branch-and-cut framework. With such an approach,
we are able to obtain promising computational results on a number of portfolio optimization
problems with real-life data.

The paper is organized as follows. In Section 2 we present mixed integer rounding cuts
for p-cone constrained mixed integer sets. Section 3 discusses (nonlinear) lifted cuts for
0-1 and mixed integer p-order cone programming problems. Computational studies of the
developed techniques on randomly generated MIpOCP problems as well as portfolio optimization
problems with real-life data are discussed in Section 4, followed by concluding
remarks in Section 5.


2  Conic  Mixed  Integer  Rounding  Cuts  for  p-Order  Cones 

In this section we present a class of mixed integer rounding cuts for MIpOCP problems
arising in the context of risk-averse stochastic optimization. A mixed integer p-order cone
programming problem has the form

$$
\begin{aligned}
\min\ & c_x^\top x + c_y^\top y \\
\text{s.t.}\ & D_x x + D_y y \le d, \\
& \|A_j x + G_j y - b_j\|_{p_j} \le e_j^\top x + f_j^\top y - h_j, \quad j = 1, \dots, k, \\
& x \in \mathbb{Z}^n_+,\ y \in \mathbb{R}^q_+,
\end{aligned}
\qquad (1)
$$

where $p_j \in\, ]1,\infty[$, and $\|\cdot\|_p$ is the usual p-norm in the Euclidean space of an appropriate
dimension: $\|r\|_p = (|r_1|^p + \dots + |r_N|^p)^{1/p}$.

MIpOCP problems (1) can be obtained from stochastic programming models that involve
specific families of risk measures in objectives or constraints. Namely, given a probability
space $(\Omega, \mathcal{F}, \mathsf{P})$, let the cost or loss function $F$ be an element of the linear space
$\mathcal{L}_p(\Omega, \mathcal{F}, \mathsf{P})$ of $\mathcal{F}$-measurable functions $Y : \Omega \mapsto \mathbb{R}$, where $p \ge 1$. Then, the
higher-moment coherent risk measures $\mathrm{HMCR}_{p,\alpha}(F)$ are defined as the optimal values of the
following convex stochastic optimization problem [13]

$$\mathrm{HMCR}_{p,\alpha}(F) = \min_{\eta \in \mathbb{R}} \big\{ \eta + (1-\alpha)^{-1} \| [F - \eta]_+ \|_p \big\}, \quad \alpha \in\, ]0,1[, \quad p \ge 1, \qquad (2)$$

where $[F]_+ = \max\{0, F\}$ and $\|F\|_p = (\mathsf{E}|F|^p)^{1/p}$. A related family of semi-moment coherent
risk measures, or risk measures of semi-$\mathcal{L}_p$ type [14], is given as

$$\mathrm{SMCR}_{p,\beta}(F) = \mathsf{E}F + \beta \| [F - \mathsf{E}F]_+ \|_p, \quad \beta \in [0,1], \quad p \ge 1. \qquad (3)$$
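For a finite scenario set, definition (2) reduces to a one-dimensional convex minimization over $\eta$. The following is a minimal sketch, not the authors' implementation: it assumes equally likely scenarios unless probabilities are supplied, uses a plain ternary search instead of a conic solver, and the function name `hmcr` is hypothetical.

```python
def hmcr(losses, alpha=0.9, p=2.0, probs=None, iterations=200):
    """HMCR_{p,alpha} of a discrete loss distribution, following definition (2).

    Minimizes eta + (1 - alpha)^(-1) * ||[F - eta]_+||_p over eta by ternary
    search, which is applicable because the objective is convex in eta.
    """
    m = len(losses)
    if probs is None:
        probs = [1.0 / m] * m  # assumption: equally likely scenarios
    c = 1.0 / (1.0 - alpha)

    def objective(eta):
        tail = sum(q * max(f - eta, 0.0) ** p for f, q in zip(losses, probs))
        return eta + c * tail ** (1.0 / p)

    mean = sum(q * f for f, q in zip(losses, probs))
    hi = max(losses)  # for eta >= max loss the objective equals eta
    # Below this point the objective is at least objective(hi), because
    # ||[F - eta]_+||_p >= E[F] - eta, so the minimizer lies in [lo, hi].
    lo = min(min(losses), (c * mean - hi) / (c - 1.0))
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


if __name__ == "__main__":
    sample = list(range(1, 11))  # hypothetical losses 1, ..., 10
    # For p = 1 the measure is CVaR: the mean of the five largest losses.
    print(round(hmcr(sample, alpha=0.5, p=1), 6))
    print(round(hmcr(sample, alpha=0.5, p=2), 6))
```

For $p = 1$ the value coincides with the Conditional Value-at-Risk, and the value is nondecreasing in $p$ because the $p$-norm with respect to a probability measure is nondecreasing in $p$.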

In the case when the set $\Omega$ is finite, $\Omega = \{\omega_1, \dots, \omega_m\}$, and the cost function $F = F(u, \omega)$
is a piecewise linear convex function of the decision vector $u$, terms with HMCR or SMCR
measures in the objective function and/or constraints can be implemented via linear inequalities
involving $F(u, \omega_i)$ and p-order cone constraints $t \ge \|(w_1, \dots, w_m)\|_p$, thus leading to
an MIpOCP problem of the form

$$
\begin{aligned}
\min\ & c_x^\top x + c_y^\top y \\
\text{s.t.}\ & D_x x + D_y y \le d, \\
& \| [A_j x + G_j y - b_j]_+ \|_{p_j} \le e_j^\top x + f_j^\top y - h_j, \quad j = 1, \dots, k, \\
& x \in \mathbb{Z}^n_+,\ y \in \mathbb{R}^q_+.
\end{aligned}
\qquad (4)
$$

Formulation (4) differs from (1) by the presence of the operator $[\cdot]_+$, which explicitly accounts
for the problem structure induced by downside risk measures such as (2)-(3). For simplicity,
we consider the case of a single p-cone constraint in (4), $k = 1$. Following the approach of
[7] for constructing mixed integer rounding cuts for problems of type (1) with $p = 2$, we
rewrite the p-cone constraint in (4) as

$$
\begin{aligned}
& t_0 \le e^\top x + f^\top y - h, \\
& t_i \ge [a_i^\top x + g_i^\top y - b_i]_+, \quad i = 1, \dots, m, \\
& t_0 \ge \|(t_1, \dots, t_m)\|_p,
\end{aligned}
$$

where $a_i$ and $g_i$ denote the i-th rows of matrices $A$ and $G$, respectively. Then, the task of
deriving valid inequalities for the original p-cone mixed integer set in (4) can be reduced to
obtaining valid inequalities for the polyhedral mixed integer set

$$T = \{ x \in \mathbb{Z}^n_+,\ y \in \mathbb{R}^q_+,\ t \in \mathbb{R} : [a^\top x + g^\top y - b]_+ \le t \},$$

or, without loss of generality, the set

$$T = \{ (y^+, y^-, t, x) \in \mathbb{R}^3_+ \times \mathbb{Z}^n_+ : [a^\top x + y^+ - y^- - b]_+ \le t \}. \qquad (5)$$

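The following two propositions provide an expression for a family of such inequalities.

Proposition 2.1 For $\alpha \ne 0$, the inequality

$$\sum_{j=1}^n \varphi_{f_\alpha}\!\left(\frac{a_j}{\alpha}\right) x_j - \varphi_{f_\alpha}\!\left(\frac{b}{\alpha}\right) \le \frac{t + y^-}{|\alpha|}, \qquad (6)$$

where $f_\alpha = \frac{b}{\alpha} - \left\lfloor \frac{b}{\alpha} \right\rfloor$ and

$$\varphi_f(a) = \begin{cases} (1-f)\,n, & n \le a < n + f, \\ (1-f)\,n + (a - n) - f, & n + f \le a < n + 1, \end{cases} \qquad n \in \mathbb{Z},$$

is valid for $T$.

Proposition 2.2 Inequalities (6) with $\alpha = a_j$, $j = 1, \dots, n$, are sufficient to cut off all
fractional extreme points of the relaxation of $T$.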


Proofs of Propositions 2.1 and 2.2 are furnished in the Appendix. It is worth noting,
however, that since (5) is a polyhedral mixed integer set, the derived valid inequalities can
also be obtained using the general theory of mixed integer rounding (MIR) inequalities; see,
for example, [15]. An advantage of the direct derivation is that it provides a natural way
of dealing with continuous variables $y^+, y^-, t$. Propositions 2.1 and 2.2 justify the usage of
inequalities of type (6) as cuts in a branch-and-cut procedure; following [7], we refer to
these inequalities as conic MIR cuts. The results of numerical experiments on utilization of
conic MIR cuts (6) in MIpOCP problems are presented in Section 4.
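To make the rounding step concrete, the sketch below computes cut coefficients for the set (5) under stated assumptions: the scaling parameter is restricted to $\alpha > 0$, the cut is taken in the standard MIR form $\sum_j \varphi_f(a_j/\alpha)\, x_j - \varphi_f(b/\alpha) \le (t + y^-)/\alpha$ with $f$ the fractional part of $b/\alpha$, and the names `phi` and `conic_mir_cut` are hypothetical.

```python
import math


def phi(f, a):
    """Rounding function assumed for the cut, with n = floor(a).

    Returns (1 - f) * n on [n, n + f) and (1 - f) * n + (a - n) - f on [n + f, n + 1).
    """
    n = math.floor(a)
    frac = a - n
    if frac < f:
        return (1.0 - f) * n
    return (1.0 - f) * n + frac - f


def conic_mir_cut(a, b, alpha=1.0):
    """Coefficients of sum_j coef[j] * x[j] - const <= (t + y_minus) / alpha.

    Assumption: alpha > 0, so dividing the defining inequality by alpha
    preserves its direction.
    """
    scaled_b = b / alpha
    f = scaled_b - math.floor(scaled_b)
    coef = [phi(f, a_j / alpha) for a_j in a]
    return coef, phi(f, scaled_b), f


if __name__ == "__main__":
    # Hypothetical data: [3*x1 + 2*x2 + y_plus - y_minus - 2.5]_+ <= t
    coef, const, f = conic_mir_cut([3.0, 2.0], 2.5, alpha=2.0)
    print(coef, const, f)
```

With these data the point $x = (0, 2)$, $t = 1.5$, $y^- = 0$ satisfies the cut with equality, which indicates that the inequality is tight at some integer points while still removing fractional ones.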


3 Lifted Conic Cuts for p-Order Cones

3.1  General  Framework 

Lifting for conic mixed integer programming was studied in [8], where a general approach
for constructing valid nonlinear conic inequalities for mixed integer conic programming
problems was proposed. Namely, consider a general mixed integer conic set

$$S^n(b) = \Big\{ (x^0, \dots, x^n) \in X^0 \times \cdots \times X^n : b - \sum_{i=0}^n A^i x^i \in \mathcal{C} \Big\}, \qquad (7)$$

where $A^i \in \mathbb{R}^{m \times n_i}$, $b \in \mathbb{R}^m$, $\mathcal{C}$ is a proper cone (a closed, convex, pointed cone with a
nonempty interior), and each $X^i \subset \mathbb{R}^{n_i}$ is a mixed integer set. Similarly, $S^0(b), \dots, S^{n-1}(b)$
are restrictions of the set $S^n(b)$. Further, it is assumed that the following conic inequality

$$h - F^0 x^0 \in \mathcal{K},$$

where $\mathcal{K}$ is a proper cone, is known to be valid for the restriction $S^0(b)$. The approach
proposed in [8] is to iteratively find a sequence $F^1, \dots, F^n$ such that

$$h - \sum_{j=0}^i F^j x^j \in \mathcal{K} \qquad (8)$$

is valid for the respective restriction $S^i(b)$ for all $i$. Such a procedure is called lifting, and
the resulting inequality that is valid for the initial mixed integer set $S^n(b)$ is called a lifted
inequality. In order to determine the values of $F^1, \dots, F^n$, the lifting set is introduced for
$v \in \mathbb{R}^m$ as

$$\Phi_i(v) = \Big\{ d \in \mathbb{R}^s : h - \sum_{j=0}^i F^j x^j - d \in \mathcal{K} \ \text{for all } (x^0, \dots, x^i) \in S^i(b - v) \Big\}.$$

Then,  a  necessary  and  sufficient  condition  for  (8)  to  be  valid  can  be  formulated,  which 
essentially  provides  a  description  of  the  set  of  valid  inequalities. 

Proposition 3.1 [8] Inequality (8) is valid for $S^i(b)$ if and only if $F^i t \in \Phi_i(A^i t)$ for all
$t \in X^i$ and $i = 0, \dots, n$.

The condition established by Proposition 3.1 is still too general to be used for derivation
of conic cuts. For example, it can be seen that in this way the resulting inequalities are
sequence-dependent, i.e., a change in the order in which variables $x^i$ are introduced will
change the sets $\Phi_i(v)$. The following theorem provides a "sequence-independent" approach
to the construction of a lifting procedure.

Theorem 3.1 [8] If $\Upsilon(v) \subseteq \Phi_0(v)$ for all $v \in \mathbb{R}^m$ and $\Upsilon$ is superadditive, then (8) is a lifted
valid inequality for $S^n(b)$ whenever $F^i t \in \Upsilon(A^i t)$ for all $t \in X^i$ and $i = 0, \dots, n$.

Then,  the  following  procedure  can  be  formulated  for  derivation  of  lifted  conic  inequalities: 

Step 1. Compute $\Phi_0(v)$.

Step 2. If $\Phi_0(v)$ is not superadditive, find a superadditive $\Upsilon(v) \subseteq \Phi_0(v)$.

Step 3. For each $i$ find $F^i$ such that $F^i t \in \Upsilon(A^i t)$ is satisfied for all $t \in X^i$.

In [8] this process was employed to obtain nonlinear lifted conic cuts for 0-1 MISOCP
problems; however, no computational results were reported. Below we apply this procedure
to derive nonlinear lifted conic cuts for 0-1 and mixed integer p-order cone programming
problems with risk-based constraints, and also discuss polyhedral approximations of these
cuts that are used in numerical implementation.


3.2 Lifting Procedure for 0-1 p-Order Cone Programming Problems

In the case of a 0-1 p-order cone programming problem, consider the following conic set

$$S_p^n(b) = \Big\{ (x, \eta^+, \eta^-, y, t) \in \{0,1\}^n \times \mathbb{R}^4_+ : \Big[ \sum_{i=1}^n a_i x_i + \eta^+ - \eta^- - b \Big]_+^p + y^p \le t^p \Big\},$$

where $p \in\, ]1,\infty[$. The set $S_p^n(b)$ represents a relaxation of a high dimensional 0-1 mixed
integer p-order conic set: all but one of the dimensions of the p-cone are aggregated into the term
$y^p$. By complementing the binary variables, if necessary, we can assume that all $a_i > 0$. The
restriction $S_p^0$ of this set can be taken as

$$S_p^0(b) = \{ (x, y, t) \in \{0,1\} \times \mathbb{R}^2_+ : [x - b]_+^p + y^p \le t^p \}.$$

Notice that $S_p^0(b)$ has one extreme point $(b, 0, 0)$, which is fractional when $b \in\, ]0,1[$. Thus, in
the only interesting case we have $\lfloor b \rfloor = 0$. Using the results of the previous section, the initial
valid inequality can be selected as $|(1-f)(x - \lfloor b \rfloor)|^p + y^p \le t^p$, where $f = b - \lfloor b \rfloor$ (the fact
that this inequality is valid can be verified directly by examining the possible values of $x, y, t$).
Now, by definition, in order to compute $\Phi_0(v)$ we need to find such $d$ that the inequality

$$|(1-f)(x - \lfloor b \rfloor) + d|^p + y^p \le t^p \qquad (9)$$

is satisfied for all $x, y, t$ such that $[x - b + v]_+^p + y^p \le t^p$.

Recalling that $\lfloor b \rfloor = 0$ and, therefore, $f = b$, we obtain that (9) can be rewritten as
$|(1-b)x + d|^p + y^p \le t^p$ for all $x, y, t$ such that $[x - b + v]_+^p + y^p \le t^p$. Given that $x \in \{0,1\}$,
for $x = 0$ we have $|d| \le [v - b]_+$, and for $x = 1$ we have $|1 - b + d| \le [1 - b + v]_+$. Thus,
if $v \ge b$ then $|d| \le v - b$, and if $v < b$ then $d = 0$, meaning that $|d| \le [v - b]_+$, whereby
$\Phi_0(v) = \{ d : |d| \le [v - b]_+ \}$, which is superadditive. Finally, the following proposition holds.

Proposition 3.2 The conic inequality

$$\Big| (1-f)(x - \lfloor b \rfloor) + \sum_{i=1}^n \alpha_i x_i \Big|^p + y^p \le t^p \qquad (10)$$

with $\alpha_i = [a_i - b]_+$ is valid for the set $S_p^n(b)$.

Proof Since $\Phi_0(v)$ is superadditive, by Theorem 3.1 we only need to verify that the chosen
values of $\alpha_i$ satisfy $\alpha_i x \in \Phi_0(a_i x)$ for $x \in \{0,1\}$, which follows readily from the expression
for $\Phi_0(v)$. □
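The superadditivity claim used in the proof can be probed numerically. The sketch below assumes that superadditivity is understood as $\Phi_0(u) + \Phi_0(v) \subseteq \Phi_0(u + v)$ (Minkowski sum), which for symmetric intervals of radius $[v - b]_+$ amounts to $[u - b]_+ + [v - b]_+ \le [u + v - b]_+$; it further assumes nonnegative arguments, in line with $a_i > 0$ and $x \in \{0, 1\}$. The function names are hypothetical.

```python
def radius(v, b):
    """Half-width of the interval {d : |d| <= [v - b]_+}."""
    return max(v - b, 0.0)


def is_superadditive_on_grid(b, grid, tol=1e-12):
    """Check [u - b]_+ + [v - b]_+ <= [u + v - b]_+ for all u, v in the grid."""
    return all(
        radius(u, b) + radius(v, b) <= radius(u + v, b) + tol
        for u in grid
        for v in grid
    )


if __name__ == "__main__":
    b = 0.4  # hypothetical fractional right-hand side in ]0, 1[
    nonnegative = [0.1 * k for k in range(31)]
    print(is_superadditive_on_grid(b, nonnegative))
    # With a negative argument the property can fail, e.g. u = 2, v = -1.
    print(is_superadditive_on_grid(b, nonnegative + [-1.0]))
```

A grid check is of course not a proof; it only illustrates that the property relies on the arguments being nonnegative.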


3.3  Lifting  Procedure  for  MIpOCP  Problems 

Similarly, in the case of an MIpOCP problem we consider the set

$$S_p^n(b) = \Big\{ (x, \eta^+, \eta^-, y, t) \in \mathbb{Z}^n_+ \times \mathbb{R}^4_+ : \Big[ \sum_{i=1}^n a_i x_i + \eta^+ - \eta^- - b \Big]_+^p + y^p \le t^p \Big\},$$

where $p \in\, ]1,\infty[$. Once again, the set $S_p^n(b)$ represents a relaxation of a high dimensional
mixed integer p-order cone constraint. Let us also assume that the values $x_i$ are bounded, e.g.,
$x_i \in \{0, \dots, u_i\}$ for all $i$. Again, let us assume without loss of generality that $a_i > 0$. The
restriction of $S_p^n(b)$ can be selected as

$$S_p^0(b) = \{ (x, y, t) \in \mathbb{Z}_+ \times \mathbb{R}^2_+ : [x - b]_+^p + y^p \le t^p \}, \qquad (11)$$

but in this case let us choose a weaker initial valid inequality, $[(1-f)(x - \lfloor b \rfloor)]_+^p + y^p \le t^p$.
The problem of computing $\Phi_0(v)$ is then reduced to the problem of finding values of $d$ such
that

$$[(1-f)x - \lfloor b \rfloor (1-f) + d]_+ \le [x - b + v]_+. \qquad (12)$$

Recall that we are only interested in a superadditive subset $\Upsilon(v)$ of such a set. One of the
possible choices is $\Upsilon(v) = \{ d \ge 0 : d \le [v - b + \lfloor b \rfloor (1-f)]_+ \}$. Indeed, $0 \in \Upsilon(v)$ by
definition, and (12) is a consequence of the inequality $(1-f)x - \lfloor b \rfloor (1-f) + d \le x - b + v$,
which yields the above expression for $\Upsilon(v)$. Lastly, the following proposition holds.

Proposition 3.3 The conic inequality

$$\Big[ (1-f)(x - \lfloor b \rfloor) + \sum_{i=1}^n \alpha_i x_i \Big]_+^p + y^p \le t^p \qquad (13)$$

with $\alpha_i = \frac{[a_i - b + \lfloor b \rfloor (1-f)]_+}{u_i}$, where $u_i$ is the upper bound on $x_i$, is valid for $S_p^n(b)$.

Proof Indeed, in accordance with Section 3.1 it suffices to show that for such a choice of $\alpha_i$
we have $\alpha_i x \in \Upsilon(a_i x)$ for all $x$. For $x \ne 0$ we have

$$\Upsilon(a_i x) = \{ d \ge 0 : d \le [a_i x - b + \lfloor b \rfloor (1-f)]_+ \},$$

and

$$\alpha_i x = \frac{[a_i - b + \lfloor b \rfloor (1-f)]_+}{u_i}\, x \le [a_i - b + \lfloor b \rfloor (1-f)]_+ \le [a_i x - b + \lfloor b \rfloor (1-f)]_+ .$$

On the other hand, for $x = 0$ it is clear that $0 \in \Upsilon(0)$. □


3.4 Polyhedral Approximations of p-Order Cones


Observe that lifted cuts (10) and (13) for, respectively, 0-1 and mixed integer p-order cone
programming problems have the form of p-order cones themselves. Thus, one may expect
that while addition of such cuts can reduce the number of nodes explored in the
branch-and-bound tree, the computational cost of solving the relaxed problem with extra p-cone
constraints at the nodes may increase. In view of this, we propose to replace the nonlinear
p-order cone cuts (10) and (13) with their polyhedral approximations during the
branch-and-cut procedure. A detailed discussion of polyhedral approximations of p-order cones can be
found in [3].

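Since in our case the lifted cuts have the form of 3-dimensional p-cones, we use a simple
gradient polyhedral approximation. Particularly, a gradient polyhedral approximation for the
conic set $\mathcal{K}_p^{(3)} = \{ \xi \in \mathbb{R}^3_+ : \xi_3 \ge \|(\xi_1, \xi_2)\|_p \}$, $p \in\, ]1,\infty[$, can be constructed as

$$\mathcal{H}_p^{(3)} = \{ \xi \in \mathbb{R}^3_+ : \xi_3 \ge \alpha_i^{(p)} \xi_1 + \beta_i^{(p)} \xi_2,\ i = 0, \dots, \ell \}, \qquad (14)$$

where

$$\begin{pmatrix} \alpha_i^{(p)} \\ \beta_i^{(p)} \end{pmatrix} = (\cos^p \theta_i + \sin^p \theta_i)^{\frac{1}{p} - 1} \begin{pmatrix} \cos^{p-1} \theta_i \\ \sin^{p-1} \theta_i \end{pmatrix}, \qquad \theta_i = \frac{\pi i}{2 \ell}, \quad i = 0, \dots, \ell.$$

Here $\mathcal{H}_p^{(3)}$ is an approximation of $\mathcal{K}_p^{(3)}$ in the sense that $\xi \in \mathcal{K}_p^{(3)}$ implies $\xi \in \mathcal{H}_p^{(3)}$, and
$\xi \in \mathcal{H}_p^{(3)}$ implies $(1 + \varepsilon)\xi_3 \ge \|(\xi_1, \xi_2)\|_p$, where $\varepsilon = \varepsilon(\ell)$ is the accuracy of approximation.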
In  the  case  of  polyhedral  approximation  (14),  the  latter  can  be  estimated  as  [16] 


m 


Up- !)(i)  >  [2)°°[- 


For example, for $p = 4.0$ it suffices to have $\ell = 25$ facets in the approximation to ensure an
accuracy of $10^{-3}$.
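The accuracy of a gradient approximation can be estimated numerically. The sketch below assumes that each facet is the tangent plane of the $p$-norm at the direction $(\cos\theta_i, \sin\theta_i)$ with equally spaced angles $\theta_i = \pi i / (2\ell)$, and it measures the worst relative gap over a sampled set of directions rather than deriving a bound; the function names and the sampling resolution are hypothetical.

```python
import math


def gradient_facets(p, num):
    """Tangent-plane coefficients (alpha_i, beta_i), i = 0..num, of xi3 >= ||(xi1, xi2)||_p."""
    facets = []
    for i in range(num + 1):
        theta = math.pi * i / (2 * num)
        c = max(math.cos(theta), 0.0)  # guard against rounding at the endpoints
        s = max(math.sin(theta), 0.0)
        scale = (c ** p + s ** p) ** (1.0 / p - 1.0)
        facets.append((scale * c ** (p - 1), scale * s ** (p - 1)))
    return facets


def approximation_error(p, num, samples=2000):
    """Largest sampled value of ||(xi1, xi2)||_p / max_i(alpha_i*xi1 + beta_i*xi2) - 1."""
    facets = gradient_facets(p, num)
    worst = 0.0
    for k in range(samples + 1):
        angle = math.pi * k / (2 * samples)
        x1 = max(math.cos(angle), 0.0)
        x2 = max(math.sin(angle), 0.0)
        norm = (x1 ** p + x2 ** p) ** (1.0 / p)
        lower = max(a * x1 + b * x2 for a, b in facets)
        worst = max(worst, norm / lower - 1.0)
    return worst


if __name__ == "__main__":
    for num in (10, 25, 50):
        print(num, approximation_error(4.0, num))
```

Because the tangent planes never exceed the norm, the polyhedral set contains the cone, and the sampled gap shrinks as facets are added.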

4 Computational Results

In  this  section  we  report  the  results  of  numerical  experiments  on  applying  the  derived  MIR 
and  lifted  conic  cuts  to  MIpOCP  problem  instances.  In  our  case  study,  three  types  of  problem 
instances  were  considered:  the  first  type  represents  the  “generic”  MIpOCP  instances  with 
randomly  generated  data,  and  the  second  and  third  types  of  instances  represent  portfolio 
optimization  problems  with  cardinality  constraints  and  lot-buying  constraints,  respectively. 
Historical  financial  data  were  used  for  both  types  of  portfolio  optimization  problems.  A 
detailed  description  of  each  problem  type  is  given  below. 

Computations were run on a 3GHz PC with 4GB RAM, and the CPLEX 12.2 solver was
used. Since CPLEX cannot natively handle p-cone constraints with $p \ne 2$, a second-order
cone reformulation [17-19] was applied to p-order cone constraints with rational $p > 2$.
The derived cuts were added at the root node of the branch-and-bound tree using CPLEX
callback routines. In addition, each instance was solved using the default mixed integer
CPLEX solver with built-in cuts. In both cases, the default solver configuration was used, with
the exceptions that the number of threads was limited to one and QCP relaxations of the
model were used at each node.


4.1  Problem  Formulations 

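Randomly generated MIpOCP problems. The first set of problem instances consisted of randomly
generated mixed integer p-order cone programming problems of the general form.
Specifically, the following formulation was used:

$$
\begin{aligned}
\min\ & c^\top x + y^+ + y^- \\
\text{s.t.}\ & \| [A x + y^+ \mathbf{1} - y^- \mathbf{1} - b]_+ \|_p \le e^\top x + f y^+ + g y^- - h, \\
& x \in \mathbb{Z}^n_+,\ y^+, y^- \in \mathbb{R}_+,
\end{aligned}
\qquad (15)
$$

where $A \in \mathbb{R}^{m \times n}$, $c, e \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $f, g, h \in \mathbb{R}$, and $\mathbf{1} = (1, \dots, 1)^\top$. Each of the parameters
$A, b, c, e, f, g, h$ in (15) was selected from the uniform $U(1, 1000)$ distribution.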


Portfolio optimization with cardinality constraints. The second set of problem instances
consisted of portfolio optimization problems with cardinality constraints. Specifically,
portfolio risk as given by the HMCR measure was minimized while requiring that the portfolio's
expected return was not below some prescribed level $r_0$. No short sales were allowed, and
the cardinality constraint ensured that the portfolio comprised no more than $K$ assets:

$$\min_{y \in \mathbb{R}^n_+,\ x \in \{0,1\}^n} \big\{ \mathrm{HMCR}_{p,\alpha}(-r^\top y) : \mathsf{E}(r^\top y) \ge r_0,\ \mathbf{1}^\top y \le 1,\ y \le x,\ \mathbf{1}^\top x \le K \big\}, \qquad (16)$$

where vectors $y$ and $r = r(\omega)$ represented the weights of assets in the portfolio and the
assets' uncertain returns, respectively. Using definition (2) of HMCR measures and assuming
that the stochastic vector $r(\omega)$ is discretely distributed with $m$ scenarios $r(\omega_i)$, $i = 1, \dots, m$,
the portfolio optimization problem (16) can be formulated as a 0-1 MIpOCP problem with an
$(m+1)$-dimensional p-cone constraint. In our computations we set $K = 5$ and $\alpha = 0.9$ in
(16).
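For concreteness, one way the reformulation of (16) can be written is sketched below. It is an illustration under stated assumptions rather than the authors' exact model: the scenario probabilities $\pi_i$ (for instance $\pi_i = 1/m$) and the auxiliary variables $\eta$, $t$, $w_i$ are notation introduced here.

```latex
\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1}\, t \\
\text{s.t.}\ & t \ge \|(w_1, \dots, w_m)\|_p, \\
& w_i \ge \pi_i^{1/p} \bigl( -r(\omega_i)^\top y - \eta \bigr), \quad w_i \ge 0, \quad i = 1, \dots, m, \\
& \sum_{i=1}^m \pi_i\, r(\omega_i)^\top y \ge r_0, \quad \mathbf{1}^\top y \le 1, \quad y \le x, \quad \mathbf{1}^\top x \le K, \\
& y \in \mathbb{R}^n_+, \quad x \in \{0,1\}^n, \quad \eta \in \mathbb{R}.
\end{aligned}
```

The cone constraint involves the $m + 1$ variables $(t, w_1, \dots, w_m)$, and it reproduces the norm in (2) because $\| [F - \eta]_+ \|_p = \bigl( \sum_i \pi_i [F(\omega_i) - \eta]_+^p \bigr)^{1/p}$ for a discrete distribution.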


DISTRIBUTION  A:  Distribution  approved  for  public  release 



Portfolio optimization with lot-buying constraints. The last type of problems considered in
this case study represents portfolio optimization problems with lot-buying constraints. The
lot-buying constraints reflect the real-life trading policies of many financial markets (see,
e.g., [20-22] and references therein), where the investors are allowed to buy or sell shares of
financial instruments only in lots of standard size $L$, e.g., in multiples of $L = 1{,}000$ shares.
Following the same setup as above, a risk-minimizing portfolio allocation problem with
lot-buying constraints is formulated as

$$\min_{y \in \mathbb{R}^n_+,\ x \in \mathbb{Z}^n_+} \Big\{ \mathrm{HMCR}_{p,\alpha}(-r^\top y) : \mathsf{E}(r^\top y) \ge r_0,\ \mathbf{1}^\top y \le 1,\ y = \frac{L}{C}\, \mathrm{Diag}(\mathbf{p})\, x \Big\}. \qquad (17)$$

Here $L \in \mathbb{N}$ is the given lot size, $C > 0$ is the available capital (in dollars), vector $\mathbf{p} \in \mathbb{R}^n_+$
represents the current (observable) asset prices per share, and $\mathrm{Diag}(a)$ denotes a matrix whose
diagonal elements are equal to the corresponding elements of vector $a$, and off-diagonal
elements are zero. Similarly to the above, portfolio problem (17) reduces to an MIpOCP problem
with an $(m+1)$-dimensional p-cone constraint, where $m$ is the number of scenarios in the stochastic
representation of the vector of assets' returns $r$. The values of parameters $L$ and $C$ in our
experiments were set at $L = 1{,}000$ and $C = \$100{,}000$.
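The coupling between lot counts and portfolio weights can be illustrated directly. The sketch below evaluates $y = (L/C)\,\mathrm{Diag}(\mathbf{p})\,x$ for given integer lot counts; the prices and lot counts are hypothetical, and the defaults mirror the values $L = 1{,}000$ and $C = \$100{,}000$ quoted above.

```python
def lot_weights(prices, lots, lot_size=1000, capital=100_000.0):
    """Portfolio weights y_i = L * p_i * x_i / C for integer lot counts x_i."""
    return [lot_size * price * count / capital for price, count in zip(prices, lots)]


def within_budget(weights, tol=1e-12):
    """Budget constraint 1^T y <= 1."""
    return sum(weights) <= 1.0 + tol


if __name__ == "__main__":
    prices = [20.0, 35.5]  # hypothetical prices per share, in dollars
    lots = [2, 1]          # hypothetical integer lot counts
    weights = lot_weights(prices, lots)
    print(weights, within_budget(weights))
```

Since each weight is an integer multiple of $L p_i / C$, the attainable portfolios form a lattice, which is the source of the integrality in (17).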

For portfolio optimization problems, we used historical data for $n$ stocks chosen at random
from the S&P 500 index; returns over $m$ consecutive 10-day periods starting at a (common)
randomized date were used to construct the set of $m$ scenarios for the stochastic vector $r$
in (16), (17).
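The scenario construction described above can be sketched as follows. This is an assumption-laden illustration: the text does not state whether simple or logarithmic returns were used or whether the periods overlap, so the sketch assumes simple returns over non-overlapping windows. The function names and the price data are hypothetical.

```python
def period_returns(prices, start, m, period=10):
    """Simple returns of one asset over m consecutive, non-overlapping windows."""
    last_index = start + m * period
    if last_index >= len(prices):
        raise ValueError("not enough price observations for m periods")
    return [
        prices[start + (k + 1) * period] / prices[start + k * period] - 1.0
        for k in range(m)
    ]


def build_scenarios(price_history, start, m, period=10):
    """Return m scenarios; scenario k holds the k-th period return of every asset."""
    per_asset = [period_returns(p, start, m, period) for p in price_history]
    return [list(scenario) for scenario in zip(*per_asset)]


# Hypothetical daily prices for two assets (21 observations each).
price_history = [
    [100.0 + day for day in range(21)],
    [50.0] * 21,
]
scenarios = build_scenarios(price_history, start=0, m=2)
```

Using a common `start` index for all assets keeps the scenarios aligned in time, so each scenario is one joint realization of the return vector $r$.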


4.2  Discussion  of  Results:  Conic  MIR  Cuts 

Randomly generated MIpOCP problems. For each pair of parameters $(n, m)$ that determine
the number of integer variables and the dimensionality of the $p$-cone, 50 randomly generated
instances of problem (15) were solved. The results are summarized in Table 1, which reports the
average computational time (in seconds), the average number of nodes explored in the search
tree, and the average number of cuts added during the solution procedure. In addition, we report
the percentage of cases in which the addition of conic MIR cuts improves the computational time
and the number of nodes explored, respectively, as compared to the default CPLEX routines.
Randomly generated problems turned out to be relatively easy to solve; in fact, many instances
were solved at the root node. Therefore, in addition to the results averaged over all instances
of a given problem size $(n, m)$, Table 1 presents the results averaged over “difficult”
instances, i.e., instances that could not be solved at the root node by the CPLEX solver with
default parameter settings. As one can see, in most cases the use of conic MIR cuts reduces the
average solution time and the number of nodes explored in the solution tree, with the
improvement being more noticeable for “difficult” instances and larger problem sizes. It is also
worth noting that while solution times vary for different values of the parameter $p$, the
observed improvement due to conic MIR cuts stays approximately the same.

Portfolio optimization with cardinality constraints. For each problem size we generated
30 problem instances. The obtained results are summarized in Table 2. We can again conclude
that for the majority of the instances, the introduction of conic MIR cuts improves performance
in comparison to the default CPLEX solution procedures, although the improvement is
considerably smaller than that observed on randomly generated problems. Note also that a
significantly smaller number of cuts were generated in problem instances of this type;
moreover, in many cases the default CPLEX optimizer did not add any cuts to the problem.

Table 1 Performance of conic MIR cuts for randomly generated MIpOCP problems. The “% better” column
represents the percentage of problem instances for which the conic MIR cuts approach outperformed CPLEX
with default parameters in terms of solution time and number of nodes, respectively. “Difficult” instances
represent problem instances that cannot be solved at the root node.

p = 2.0
                                 all instances                            “difficult” instances
   n     m         default CPLEX      conic MIR       % better  default CPLEX      conic MIR       % better
 500   200  time           26.88          22.88         29.41%          58.22          43.77         61.11%
            nodes            2.0           0.75        100.00%           5.67           2.11         100.0%
            cuts           16.74          48.65              -          16.06          50.94              -
       600  time           218.0         224.72         52.83%         356.27         369.85         67.86%
            nodes           3.34           3.17         92.45%           6.32            6.0         85.71%
            cuts           73.45          53.90              -          19.08          55.82              -
      1000  time         1117.45         856.59         45.61%        2045.46        1418.66         65.22%
            nodes           1.68           0.60         96.49%           4.17           1.48         91.30%
            cuts          102.54          63.40              -          76.00          50.87              -

p = 3.0
                                 all instances                            “difficult” instances
   n     m         default CPLEX      conic MIR       % better  default CPLEX      conic MIR       % better
 500   200  time           12.60          11.10         37.25%          24.11          20.68         76.92%
            nodes           0.88           0.31        100.00%           1.23           3.46         100.0%
            cuts           11.71          49.65              -          11.38          50.94              -
       600  time          189.76          71.90         51.92%         421.64          133.0         87.50%
            nodes           6.92           2.13        100.00%          22.94           7.06        100.00%
            cuts           18.92          54.58              -          15.37          48.26              -
      1000  time          910.04         560.12         66.67%        1741.93         974.53         61.90%
            nodes           1.53           0.35         98.25%           4.14           0.95         95.24%
            cuts           32.81          63.40              -           22.0          50.87              -

p = 4.0
                                 all instances                            “difficult” instances
   n     m         default CPLEX      conic MIR       % better  default CPLEX      conic MIR       % better
 500   200  time           31.92          26.54         35.29%          62.04          48.06         52.17%
            nodes           2.29           0.98         98.04%           5.09           2.17         95.65%
            cuts           26.16          48.65              -          29.17          63.83              -
       600  time          582.88         324.86         43.40%         875.88         471.92         55.88%
            nodes           9.25            8.0         88.84%          14.41          12.47         82.36%
            cuts           76.75          53.91              -          37.87          60.01              -


Portfolio  optimization  with  lot-buying  constraints.  The  results  averaged  over  30  instances 
for  each  problem  size  are  summarized  in  Table  3.  Note  that  in  many  instances  of  problems 
of  this  type,  no  user  cuts  of  the  proposed  structure  have  been  found.  It  can  also  be  noted  that 
regardless  of  the  number  of  cuts  found,  solution  times  are  rather  comparable  to  those  of  the 
default  CPLEX  optimizer,  which  may  indicate  that  conic  MIR  cuts  do  not  make  a  significant 
difference  in  problems  of  this  type. 


4.3  Discussion  of  Results:  Lifted  Conic  Cuts 

Portfolio Optimization. To evaluate the performance of the lifted cuts derived in Section
3, we used both types of portfolio optimization problems, with parameters set up as described
above. As already noted, each lifted nonlinear cut was replaced by its outer gradient polyhedral
approximation. Specifically, the approximation accuracy was set at $10^{-3}$. Since in this
case each cut results in multiple additional linear constraints, we restricted the number of
lifted cuts added at the root node to two. The results obtained for portfolio optimization
problems with cardinality constraints (16) and lot-buying constraints (17), each averaged over
30 problem instances, are summarized in Tables 2 and 3, respectively. We observed similar
improvements in computational time for both types of problems. We also observed that the use of
lifted cuts in portfolio optimization with lot-buying constraints does not generally reduce the
number of nodes explored in the solution tree. Based on this observation and the results of the
experiments of the previous section, the observed improvement is probably partially due to
considerably less time spent looking for cuts. In contrast, in portfolio problems with
cardinality constraints we observe reductions in both the number of nodes and solution times
due to the use of lifted cuts.

Table 2 Performance of conic MIR and lifted cuts in cardinality constrained portfolio optimization problems.
Entries marked with * correspond to the minimum solution time for each row. Results are averaged over 30
instances for each problem size.

p = 2.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
 100   600    360.97     31.31      0.10      315.98     31.90      3.00     281.34*     30.59      2.00
      1000    787.16     31.15      0.00      772.44     77.90      3.00     595.66*     30.77      2.00
      1400    916.18     37.58      0.00      766.14     55.50      3.00     664.73*      25.8      2.00
 150   600    446.11     41.80      0.00      400.02     41.20      3.00     377.87*     40.20      2.00
      1000   1566.79     53.44      0.00     1436.57     53.20      3.00    1326.74*     52.33      2.00
      1400   2601.84     40.69      0.00     2343.03     38.83      3.00    2196.61*     39.92      2.00

p = 3.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
 100   600    813.62     47.93      0.00     537.14*     45.63      3.00      610.98     45.35      2.00
      1000   1449.75     49.78      0.00     1216.24     49.90      3.00    1213.02*     49.67      2.00
      1400   1671.64     36.38      0.00     1518.44     59.87      3.00    1428.81*      40.2      2.00
 150   600    488.07     41.40      0.20      415.92     40.67      3.00     354.40*     39.80      2.00
      1000   2877.30     80.81      0.05     2661.90     83.87      3.00    2514.82*     86.71      2.00
      1400   4307.80     70.72      0.11     4006.54     70.43      3.00    3739.91*     69.89      2.00

p = 4.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
 100   600   1234.58     47.08      0.10     1186.99     45.83      3.00    1062.46*     45.58      2.00
      1000   2368.82     45.05      0.00     2204.83     48.20      3.00    2062.06*     47.87      2.00
      1400   3243.04     33.49      0.00     2630.18     34.40      3.00    2552.70*     31.48      2.00
 150   600    435.52     34.50      0.17      371.95     58.65      3.00     340.62*     33.33      2.00
      1000   5913.61     94.71      0.00     5451.90     47.95      3.00    5168.28*     97.57      2.00
      1400   6442.82     62.50      0.05     6087.91     31.30      3.00    5286.47*     62.85      2.00
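The idea of replacing a nonlinear conic constraint by gradient (outer) linear cuts can be sketched on a plain $p$-cone $\{(t, z) : t \ge \|z\|_p\}$. This is only an illustration of a single gradient cut under the assumption $p > 1$; it does not reproduce the lifted cuts of Section 3 or the procedure used to reach the $10^{-3}$ accuracy, and all names and numbers are hypothetical. Because the $p$-norm is positively homogeneous, the linearization at a point $z^0 \ne 0$ simplifies to $t \ge g^\top z$ with $g = \nabla \|z^0\|_p$.

```python
import random


def p_norm(z, p):
    """Euclidean-style p-norm of a vector given as a list of floats."""
    return sum(abs(v) ** p for v in z) ** (1.0 / p)


def gradient_cut(z0, p):
    """Coefficients g of the linear cut t >= g . z obtained by linearizing ||z||_p at z0."""
    norm = p_norm(z0, p)
    if norm == 0.0:
        raise ValueError("the p-norm is not differentiable at the origin")
    sign = lambda v: (v > 0) - (v < 0)
    return [sign(v) * (abs(v) / norm) ** (p - 1) for v in z0]


def cut_value(g, z):
    """Left-hand side g . z of the linear cut."""
    return sum(gi * zi for gi, zi in zip(g, z))


p = 3.0
z0 = [1.0, -2.0, 0.5]          # hypothetical linearization point
g = gradient_cut(z0, p)

# The cut must never exceed the norm (it is an outer approximation) ...
random.seed(0)
samples = ([random.uniform(-5, 5) for _ in z0] for _ in range(1000))
max_violation = max(cut_value(g, z) - p_norm(z, p) for z in samples)
# ... and it is tight at the linearization point itself.
tightness_gap = abs(cut_value(g, z0) - p_norm(z0, p))
```

Each such cut is one linear constraint, so an approximation of a given accuracy generally needs several linearization points; this is consistent with the remark above that every lifted cut results in multiple additional linear constraints.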


5  Concluding  Remarks 

The recent progress in solving mixed integer programming problems can partially be
attributed to advances in the use of valid inequalities for integer and mixed integer sets.
Mixed integer cuts tighten the bounds given by the continuous relaxation of the problem during
the branch-and-cut procedure and, as a result, can reduce both the number of nodes explored in
the branch-and-bound tree and the overall computational time. Typically, valid inequalities
exploit the specific structure of the feasible set of the problem.

This paper presents two families of valid inequalities for mixed integer $p$-order cone
programming problems that arise in risk-averse stochastic optimization with downside risk
measures. In particular, we developed mixed integer rounding cuts and nonlinear lifted cuts for
mixed integer $p$-order conic sets, extending the corresponding results for mixed integer
second-order cone programming problems [7,8]. Computational studies on randomly generated
problems as well as discrete portfolio optimization problems with historical data demonstrate
that both conic MIR cuts and lifted conic cuts lead to improved solution times.

Table 3 Performance of conic MIR and lifted cuts in portfolio optimization problems with lot-buying
constraints. Entries marked with * correspond to the minimum solution time for each row. Results are
averaged over 30 instances for each problem size.

p = 2.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
  10   200      9.09      4.13      1.50        9.59      5.10      0.00       8.03*      5.31      2.00
       600     45.53      4.67      2.61       40.08      5.57      0.13      32.98*      6.17      2.00
      1000    117.78     11.47      2.37      111.44     13.97      0.33     102.81*     14.74      2.00
  20   200     42.49     20.79      3.64       37.17     23.13      0.40      32.00*     25.36      2.00
       600    103.28     12.80      5.00      101.67     16.93      0.13      94.96*     20.16      2.00
      1000    188.04     13.63      3.19      177.53     13.83      1.10     168.88*     13.63      2.00
  50   200     54.50     42.94      4.38       51.21     45.40      0.50      46.55*     47.44      2.00
       600    307.66     33.19      6.19      286.28     41.27      1.50     268.13*     46.75      2.00
      1000    640.82     49.71      3.71     635.54*     62.03      0.00      664.29     69.35      2.00

p = 3.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
  10   200     18.56      4.79      3.57       17.33      7.73      0.03      15.50*      9.50      2.00
       600     49.60      8.33      2.22       42.32      9.73      0.03      34.46*     10.39      2.00
      1000     96.15     10.19      2.38       94.93     12.97      0.03      90.25*     15.38      2.00
  20   200     34.05      9.06      3.11       27.11     10.97      1.10      21.23*     12.00      2.00
       600     96.98      9.51      4.22       79.78     12.00      1.10      66.74*     13.84      2.00
      1000   130.59*      4.53      4.35      134.93      4.67      1.23      141.49      4.53      2.00
  50   200     78.29     30.55      5.10       70.07     35.93      0.03      57.25*     39.95      2.00
       600    316.89     37.39      5.33      275.04     38.17      0.03     210.81*     37.67      2.00
      1000    540.25     22.58      5.37      500.46     36.87      1.00     459.55*     47.74      2.00

p = 4.0
                           default CPLEX                  conic MIR cuts               lifted conic cuts
   n     m      time     nodes      cuts        time     nodes      cuts        time     nodes      cuts
  10   200     23.29      6.29      2.29       17.93      6.13      2.00      13.58*      5.71      2.00
       600     44.50      3.57      2.21       41.56      3.93      7.03      37.73*      4.21      2.00
      1000   122.08*      8.00      2.29      123.10     10.13     25.03      125.04     12.71      2.00
  20   200     49.11      7.93      4.07       43.88     16.07      0.13      40.19*     20.40      2.00
       600    110.42     16.47      3.31      101.32     18.00     12.50      89.95*     18.24      2.00
      1000    315.87     10.89      4.94      279.44     11.10     34.23     256.45*     10.89      2.00
  50   200    127.20     43.78      5.17      118.54     46.67      0.46     112.06*     48.06      2.00
       600    416.48     36.76      4.68      344.87     33.93     21.40     294.47*     29.32      2.00
      1000    993.53     44.50      5.71      825.43     46.20     33.17     682.21*     56.59      2.00


In general, nonlinear cuts are not yet as prevalent as linear ones, partly because
additional nonlinear inequalities in the bounding (relaxed) problem tend to increase the
computational time of the branch-and-bound procedure. In order to improve the computational
tractability of the derived nonlinear lifted cuts within the branch-and-cut framework, we
proposed replacing them with their polyhedral approximations; since the nonlinear lifted cuts
constitute low-dimensional $p$-cones, the corresponding polyhedral approximations are
relatively inexpensive. In this respect, our computational results are among the first
successful applications of nonlinear cuts in nonlinear mixed integer programming problems.
A  A Direct Derivation of Conic Mixed Integer Rounding Cuts for Mixed Integer
$p$-Order Cone Programming Problems


Following [7], let us first consider the simple case of the set

$$T = \{(x, y, w, t) \in \mathbb{Z} \times \mathbb{R}^3_+ \;:\; [x + y - w - b]_+ \le t\}.$$

Let us denote by relax($T$) the continuous relaxation of $T$ and by conv($T$) its convex hull. It can be seen that the
extreme rays of relax($T$), written in the coordinates $(x, y, w, t)$, are $(1,0,0,1)$, $(-1,0,0,0)$, $(1,0,1,0)$,
$(-1,1,0,0)$, and its only extreme point is $(b,0,0,0)$. Let us also denote $f = b - \lfloor b \rfloor$. Clearly, the
case of $f = 0$ is not interesting, hence it can be assumed that $f > 0$, whereby conv($T$) has four extreme points:
$(\lfloor b \rfloor, 0, 0, 0)$, $(\lfloor b \rfloor, f, 0, 0)$, $(\lceil b \rceil, 0, 1-f, 0)$,
$(\lceil b \rceil, 0, 0, 1-f)$. With these observations in mind we can formulate the following proposition.


Proposition A.1 Inequality

$$(1 - f)\,(x - \lfloor b \rfloor) \le t + w \tag{18}$$

is valid for $T$ and cuts off all points in relax($T$) $\setminus$ conv($T$).


Proof First, let us show the validity of (18). The base inequality for $T$ is

$$[x + y - w - b]_+ \le t. \tag{19}$$

Now, let $x = \lfloor b \rfloor - \alpha$ with $\alpha \ge 0$. In this case, (19) turns into
$t \ge [y - w - f - \alpha]_+$ and (18) becomes $t \ge -(1-f)\alpha - w$. Observing that

$$[y - w - f - \alpha]_+ - \big(-(1-f)\alpha - w\big) = \max\{\,y - f - \alpha f,\ (1-f)\alpha + w\,\} \ge 0,$$

one obtains that (19) implies (18) for $x \le \lfloor b \rfloor$.

On the other hand, if $x = \lceil b \rceil + \alpha$ with $\alpha \ge 0$, then (19) becomes
$t \ge [y - w + (1-f) + \alpha]_+$ and (18) turns into $t \ge (1-f)(1+\alpha) - w$. Similarly to above,

$$[y - w + (1-f) + \alpha]_+ - \big((1-f)(1+\alpha) - w\big)$$
$$= \max\{\,y - w + (1-f) + \alpha - (1-f) - \alpha(1-f) + w,\ \ w - (1-f)(1+\alpha)\,\}$$
$$= \max\{\,y + \alpha f,\ \ w - (1-f)(1+\alpha)\,\} \ge 0,$$

which means that (19) implies (18) for $x \ge \lceil b \rceil$. Hence, (18) is valid for $T$.

To prove the remaining part of the proposition, consider the polyhedron $\hat{T}$ defined by the inequalities

$$x + y - w - b \le t, \tag{20}$$
$$0 \le t, \tag{21}$$
$$0 \le y, \tag{22}$$
$$0 \le w, \tag{23}$$
$$(1 - f)\,(x - \lfloor b \rfloor) \le t + w. \tag{24}$$

Since $\hat{T}$ has four variables, the basic solutions of $\hat{T}$ are defined by four of these inequalities at
equality. They are:

- Inequalities (20), (21), (22), (23): $(x,y,w,t) = (b, 0, 0, 0)$, which is infeasible if $f \ne 0$.

- Inequalities (20), (21), (22), (24): $(x,y,w,t) = (\lceil b \rceil, 0, 1-f, 0)$.

- Inequalities (20), (21), (23), (24): $(x,y,w,t) = (\lfloor b \rfloor, f, 0, 0)$.

- Inequalities (20), (22), (23), (24): $(x,y,w,t) = (\lceil b \rceil, 0, 0, 1-f)$.

- Inequalities (21), (22), (23), (24): $(x,y,w,t) = (\lfloor b \rfloor, 0, 0, 0)$.

Hence, conv($T$) has exactly the same extreme points as $\hat{T}$, which completes the proof. □
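Proposition A.1 can be checked numerically with a short sketch. This is a sanity check on sampled points under assumed data, not a substitute for the proof: the value of `b`, the sampling ranges, and the function names are hypothetical.

```python
import math
import random


def in_relaxation(x, y, w, t, b, tol=1e-12):
    """Membership in relax(T): y, w, t >= 0 and [x + y - w - b]_+ <= t."""
    return min(y, w, t) >= 0 and max(x + y - w - b, 0.0) <= t + tol


def simple_conic_mir_holds(x, y, w, t, b, tol=1e-9):
    """Check inequality (18): (1 - f)(x - floor(b)) <= t + w, with f = b - floor(b)."""
    f = b - math.floor(b)
    return (1 - f) * (x - math.floor(b)) <= t + w + tol


b = 2.3                      # hypothetical right-hand side with f = 0.3
random.seed(1)
violations = []
for _ in range(10000):
    x = random.randint(-5, 8)                          # integer x, so the point lies in T
    y = random.uniform(0, 3)
    w = random.uniform(0, 3)
    t = max(x + y - w - b, 0.0) + random.uniform(0, 1)  # feasible t by construction
    if not simple_conic_mir_holds(x, y, w, t, b):
        violations.append((x, y, w, t))

# The fractional extreme point (b, 0, 0, 0) of relax(T) should be cut off.
fractional_point = (b, 0.0, 0.0, 0.0)
cut_off = in_relaxation(*fractional_point, b) and not simple_conic_mir_holds(*fractional_point, b)
```

At the fractional point the left-hand side of (18) equals $(1-f)f > 0$ while the right-hand side is zero, which is exactly why the point is separated.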

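In the general case, let

$$T = \{(y^+, y^-, t, x) \in \mathbb{R}^3_+ \times \mathbb{Z}^n_+ \;:\; [a^\top x + y^+ - y^- - b]_+ \le t\}, \tag{25}$$

and consider the following function, defined for $n \in \mathbb{Z}$:

$$\varphi_f(a) = \begin{cases} (1-f)\,n, & n \le a < n + f, \\ (1-f)\,n + (a - n) - f, & n + f \le a < n + 1. \end{cases}$$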
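Proposition A.2 For $\alpha \ne 0$ the following inequality

$$\sum_{j=1}^{n} \varphi_{f_\alpha}\!\Big(\frac{a_j}{|\alpha|}\Big)\, x_j \;-\; \varphi_{f_\alpha}\!\Big(\frac{b}{|\alpha|}\Big) \;\le\; \frac{t + y^-}{|\alpha|}, \tag{26}$$

where $f_\alpha = \frac{b}{|\alpha|} - \big\lfloor \frac{b}{|\alpha|} \big\rfloor$, is valid for $T$.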


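Proof First consider the case $\alpha = 1$. We can rewrite the base inequality for (25) as

$$\Big[\Big(\sum_{f_j \le f} \lfloor a_j \rfloor x_j + \sum_{f_j > f} \lceil a_j \rceil x_j\Big) + \Big(\sum_{f_j \le f} f_j x_j + y^+\Big) - \Big(\sum_{f_j > f} (1 - f_j)\, x_j + y^-\Big) - b\Big]_+ \le t,$$

where $f_j = a_j - \lfloor a_j \rfloor$. Observe that

$$\sum_{f_j \le f} \lfloor a_j \rfloor x_j + \sum_{f_j > f} \lceil a_j \rceil x_j \in \mathbb{Z}, \qquad \sum_{f_j \le f} f_j x_j + y^+ \ge 0, \qquad \sum_{f_j > f} (1 - f_j)\, x_j + y^- \ge 0.$$

Hence, we can apply the simple conic MIR inequality (18), with these three expressions playing the roles of $x$, $y$,
and $w$, respectively:

$$(1 - f)\Big(\sum_{f_j \le f} \lfloor a_j \rfloor x_j + \sum_{f_j > f} \lceil a_j \rceil x_j - \lfloor b \rfloor\Big) \le t + \sum_{f_j > f} (1 - f_j)\, x_j + y^-.$$

Rewriting it with the help of function $\varphi_f(a)$, we obtain that

$$\sum_{j=1}^{n} \varphi_f(a_j)\, x_j - \varphi_f(b) \le t + y^-.$$

So, by Proposition A.1 inequality (26) is valid for $\alpha = 1$. In order to see that the result holds for all
$\alpha \ne 0$ we only need to scale the base inequality:

$$\Big[\frac{1}{|\alpha|}\big(a^\top x + y^+ - y^- - b\big)\Big]_+ \le \frac{t}{|\alpha|}.$$

□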
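The rounding function $\varphi_f$ and the general cut (26) for $\alpha = 1$ can be sketched and sanity-checked on sampled points as follows. The coefficient vector `a`, the right-hand side `b`, the sampling ranges, and the function names are hypothetical; the check assumes $x \in \mathbb{Z}^n_+$ and $f > 0$, as in the derivation above.

```python
import math
import random


def phi(a, f):
    """Rounding function: (1-f)n on [n, n+f) and (1-f)n + (a-n) - f on [n+f, n+1)."""
    n = math.floor(a)
    if a - n < f:
        return (1 - f) * n
    return (1 - f) * n + (a - n) - f


def conic_mir_gap(a, b, x, y_minus, t):
    """Right-hand side minus left-hand side of (26) with alpha = 1 (>= 0 when the cut holds)."""
    f = b - math.floor(b)
    lhs = sum(phi(aj, f) * xj for aj, xj in zip(a, x)) - phi(b, f)
    return t + y_minus - lhs


a = [1.7, -0.4, 2.25]        # hypothetical coefficients
b = 3.6                      # hypothetical right-hand side, f = 0.6
random.seed(2)
min_gap = float("inf")
for _ in range(10000):
    x = [random.randint(0, 4) for _ in a]
    y_plus = random.uniform(0, 2)
    y_minus = random.uniform(0, 2)
    base = sum(aj * xj for aj, xj in zip(a, x)) + y_plus - y_minus - b
    t = max(base, 0.0) + random.uniform(0, 1)   # any t satisfying the base inequality
    min_gap = min(min_gap, conic_mir_gap(a, b, x, y_minus, t))
```

Note that $\varphi_f$ is continuous at $a = n + f$, so small floating-point errors in the fractional parts do not change the cut coefficients abruptly.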


Proposition A.3 Inequalities (26) with $\alpha = a_j$, $j = 1, \ldots, n$, are sufficient to cut off all fractional
extreme points of relax($T$).

Proof The set relax($T$) is defined by $n + 3$ variables and $n + 4$ constraints. Therefore, if $x_j > 0$ in an
extreme point, then the remaining $n + 3$ constraints must be active. Thus, the continuous relaxation has at most $n$
fractional extreme points $(x^j, 0, 0, 0)$ of the form $x^j_j = b / a_j > 0$, and $x^j_i = 0$ for $i \ne j$. Such
points are infeasible if $b / a_j \notin \mathbb{Z}$. Now, let $a_j > 0$. For such a fractional extreme point
inequality (26) reduces to

$$(1 - f_{a_j})\, x_j - (1 - f_{a_j}) \Big\lfloor \frac{b}{a_j} \Big\rfloor \le \frac{t + y^-}{a_j},$$

which by Proposition A.1 cuts off the fractional extreme point with $x_j = b / a_j$.

Now, let us consider $a_j < 0$. In this case we observe that the inequality (26) reduces to

$$-(1 - f_{|a_j|})\, x_j - (1 - f_{|a_j|}) \Big\lfloor \frac{b}{|a_j|} \Big\rfloor \le \frac{t + y^-}{|a_j|},$$

which again cuts off the fractional extreme point with $x_j = b / a_j$.
Abstract. In this work, we consider risk-averse maximum weighted $k$-club problems.
It is assumed that vertices of the graph have stochastic weights whose joint
distribution is known. The goal is to find the $k$-club of minimum risk contained in
the graph. A stochastic programming framework that is based on the formalism of
coherent risk measures is used to find the corresponding subgraphs. The selected
representation of risk of a subgraph ensures that the optimal solutions are maximal
$k$-clubs. A combinatorial branch-and-bound solution algorithm is proposed, and its
solution performance is compared with that of an equivalent mathematical programming
counterpart problem for instances with $k = 2$.

Keywords. $k$-club, clique relaxation, risk-averse subgraph problem, stochastic
weights, coherent risk measures.


1.  Introduction 

A principal class of graph theoretical problems involves the identification of embodied
subgraphs corresponding to some structural property. One particular setting of fundamental
importance entails finding the largest “perfectly” cohesive group within a network such that
the confined members are all interconnected, i.e., the largest clique (complete subgraph).
Several prominent studies founded the basis for exact combinatorial solution algorithms for the
maximum clique problem [1, 2, 3]. In particular, Carraghan and Pardalos [2] introduced a
recursive branch-and-bound method for efficiently finding maximum cliques by exploiting the
heredity property [4] of complete subgraphs. Subsequent extensions of their work enhanced the
process of eliminating solution space via vertex coloring schemes for branching and upper-bound
estimation on the maximal achievable subgraph sizes during the algorithmic processing (e.g.,
[5, 6, 7]). In many practical applications, however, the requirement that the desired subgraph
must be complete may impose excessive restrictions and warrant some structural relaxation in
terms of member connectivity. As a consequence, several clique relaxation models have been
proposed in the graph theory literature. A comprehensive review of clique relaxation models is
provided in [4]. In this work we focus on a specific model, the $k$-club [8], where subgraph
members may also be indirectly connected via at most $k - 1$ intermediary members, i.e., by
paths of at most $k$ edges within the subgraph.

A popular extension of the class of problems described above involves the imposition of
topologically exogenous information in the form of deterministic vertex weights, and
correspondingly finding a subset of maximum weight that conforms to a defined structural
property. Similar exact weight-based branch-and-bound solution techniques have been developed
for determining the maximum-weight subgraphs [9, 10, 11].

Particular circumstances may further justify the imposition of uncertain exogenous
information over the graph’s edges that influences network flow distribution, robustness,
and costs [12, 13, 14, 15, 16, 17]. However, far fewer studies concern decision making
regarding optimal resource allocation over defined subgraph topologies when uncertainties are
induced by stochastic factors associated with network vertices. In this study, we adopt this
setting and extend the techniques introduced in [18] to address problems seeking subgraphs of
minimum risk that represent a $k$-club. A statistical framework utilizing the distributional
information of stochastic vertex weights by means of coherent risk measures [19, 20] is
employed to define a risk-averse maximum weighted $k$-club (R-MWK) problem as finding the
lowest-risk $k$-club in a network. As an illustrative example, we focus on instances with
$k = 2$ and utilize a mathematical programming formulation for the maximum 2-club problem
introduced in [21]. A branch-and-bound method for finding maximum $k$-clubs [22] is modified to
accommodate the conditions of R-MWK problems by bounding solutions in a coherent risk measure
context. We compare the solution performance of the proposed algorithm relative to an
equivalent mathematical programming counterpart problem for R-MWK problems with $k = 2$.

The remainder of the paper is organized as follows. In Section 2 we examine the
general representation of R-MWK problems and consider their properties. Section 3
presents a mathematical programming formulation and a combinatorial branch-and-bound
method for R-MWK problems with $k = 2$. Finally, Section 4 furnishes numerical studies
demonstrating the computational performance of the developed branch-and-bound method on
problems where risk is quantified using higher-moment coherent risk measures [23].


2.  Risk-averse  stochastic  maximum  k-club  problem 

Given an undirected graph $G = (V,E)$ and any subset of its vertices $S \subseteq V$, let $G[S]$
represent the subgraph of $G$ induced by $S$, such that any pair of vertices $i, j \in S$ share an
edge in $G[S]$ if and only if $(i,j)$ is an edge in $G$. To ease notation, define $\mathcal{Q}$ as a
desired property which the induced graph $G[S]$ must satisfy. The present work considers the case
when $\mathcal{Q}$ represents a certain relaxation of the completeness property, such that a
subgraph with property $\mathcal{Q}$ represents a clique relaxation.

Depending on the characteristic of a complete graph that is relaxed, the clique relaxations
can be categorized into density-based, degree-based, and diameter-based relaxations. The
density of a graph $G = (V,E)$ is defined as the ratio $D(G) = |E| / \binom{|V|}{2}$, where
the denominator represents the number of edges in a complete graph with $|V|$ vertices.
Evidently, a complete graph (clique) has a density of 1. Then, for a fixed $\gamma \in (0,1)$,
graph $G$ is called a $\gamma$-quasi-clique [24] if its density is at least $\gamma$:

$$D(G) \ge \gamma, \quad \text{or, equivalently,} \quad |E| \ge \gamma \binom{|V|}{2}.$$

The $\gamma$-quasi-clique is, therefore, a density-based relaxation of the clique concept, and as
such is different from the $k$-clique, which is one of the diameter-based clique relaxations.
Namely, let $d_G(i,j)$ be the distance between nodes $i,j \in V$, measured as the number of
edges in the shortest path between $i$ and $j$ in $G$. Then, the subgraph $G[S]$ induced by a
subset of nodes $S \subseteq V$ of the graph $G$ is called a $k$-clique if

$$\max_{i,j \in S} d_G(i,j) \le k.$$


Note that the definition of the $k$-clique does not require that the shortest path between
$i,j \in S$ belong to $G[S]$. If one requires that the shortest path between any two vertices
$i, j$ in $S$ belong to the induced subgraph $G[S]$, then the subset $S$ such that

$$\max_{i,j \in S} d_{G[S]}(i,j) \le k, \tag{1}$$

is called a $k$-club. Note that a $k$-club is also a $k$-clique, while the converse is not true in
general. The shortest path connecting two vertices in a clique has length 1, thus 1-cliques and
1-clubs are cliques. For a vertex $i \in V$, its degree $\deg_G(i)$ is defined as the number of
adjacent vertices: $\deg_G(i) = |\{j \in V : (i,j) \in E\}|$. A degree-based clique relaxation,
known as the $k$-plex, is defined as a subset $S$ of $V$ such that the degree of each vertex in
the induced subgraph $G[S]$ is at least $|S| - k$ [25]:

$$\deg_{G[S]}(i) \ge |S| - k \quad \text{for all } i \in S,$$

(observe that the degree of each vertex in a clique of size $n$ is equal to $n - 1$).
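The difference between these clique relaxations can be made concrete with a small sketch. The adjacency-list representation, the function names, and the 5-cycle example are assumptions made for illustration only; distances are computed by breadth-first search, restricted to the induced subgraph in the $k$-club test.

```python
from collections import deque
from itertools import combinations


def distances_from(adj, source, allowed):
    """BFS distances from `source`, walking only through vertices in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_k_clique(adj, S, k):
    """Pairwise distances measured in the whole graph G are at most k."""
    everything = set(adj)
    return all(
        distances_from(adj, i, everything).get(j, float("inf")) <= k
        for i, j in combinations(S, 2)
    )


def is_k_club(adj, S, k):
    """Pairwise distances measured inside the induced subgraph G[S] are at most k."""
    members = set(S)
    return all(
        distances_from(adj, i, members).get(j, float("inf")) <= k
        for i, j in combinations(S, 2)
    )


def is_k_plex(adj, S, k):
    """Every vertex of S has at least |S| - k neighbours inside S."""
    members = set(S)
    return all(sum(v in members for v in adj[i]) >= len(members) - k for i in members)


# Hypothetical example: the 5-cycle 1-2-3-4-5-1 and the subset S = {1, 2, 3, 4}.
cycle = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
S = [1, 2, 3, 4]
results = (is_k_clique(cycle, S, 2), is_k_club(cycle, S, 2), is_k_club(cycle, S, 3))
```

In this example vertices 1 and 4 are at distance 2 in the full cycle (through vertex 5), but at distance 3 inside $G[S]$, so $S$ is a 2-clique without being a 2-club.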

The present work considers the case when $\mathcal{Q}$ represents a distance-based relaxation
of the clique model in the sense of the $k$-club definition (1) with $k \ge 2$. Throughout the
remainder of this study we let property $\mathcal{Q}_{G[S]}$ define a $k$-club as

$$\mathcal{Q}_{G[S]} = \{\, S \subseteq V \mid \forall\, i,j \in S:\ d_{G[S]}(i,j) \le k \,\}. \tag{2}$$

A popular instance of graph-theoretic problems arises when seeking a subgraph $S$
with the maximum additive vertex weights, $w_i > 0$, that satisfies property $\mathcal{Q}_{G[S]}$.
When $\mathcal{Q}_{G[S]}$ is defined by (2), a maximum weight $k$-club problem can take the form

$$\max_{S \subseteq V} \Big\{ \sum_{i \in S} w_i \;:\; G[S] \text{ satisfies } \mathcal{Q}_{G[S]} \Big\}. \tag{3}$$

Clearly, the optimal subgraph $G[S]$ in problem (3) will be maximal, but not necessarily
the maximum (of the largest order) subgraph with property $\mathcal{Q}_{G[S]}$.

In this work, we consider an extension of problem (3) that assumes stochastic vertex
weights. In this case, a direct translation into a stochastic framework is not trivial:
maximization of random weights would be ill-posed in the context of stochastic programming,
since a deterministic optimal solution does not exist. Likewise, maximization of the expected
weight of the sought subgraph is not interesting in the sense that it reduces to the
deterministic version of the problem presented above. A more suitable approach, thus, involves
computing the subgraph’s weight via a statistical functional that utilizes the distributional
information about the weights’ uncertainties, rather than as a simple sum of its (random)
weights. To this end, we pursue a risk-averse approach so as to find the subgraph of $G$ that
has the lowest risk and satisfies the property $\mathcal{Q}$. Let $X_i$ denote random variables
that represent costs or losses associated with vertices $i \in V$, such that the joint
distribution of vector $X_G = (X_1, \ldots, X_{|V|})$ is known. The problem of finding the
minimum-risk subgraph in $G$ with property $\mathcal{Q}$, or the risk-averse maximum weighted
$\mathcal{Q}$ problem, takes the form

$$\min_{S \subseteq V} \{\, \mathcal{R}(S; X_G) \;:\; G[S] \text{ satisfies } \mathcal{Q} \,\}, \tag{4}$$

where $\mathcal{R}(S; X_G)$ is the risk of the induced subgraph $G[S]$ given the distributional
information $X_G$.

A formal representation of risk $\mathcal{R}(S; X_G)$ is invoked via the well-known concept of
a risk measure in the stochastic optimization literature [26]. Namely, given a probability space
$(\Omega, \mathcal{F}, \mathsf{P})$, where $\Omega$ is the set of random events, $\mathcal{F}$ is
the $\sigma$-algebra, and $\mathsf{P}$ is a probability measure, a risk measure is defined as a
mapping $\rho : \mathcal{X} \mapsto \mathbb{R}$, where $\mathcal{X}$ is a linear space of
$\mathcal{F}$-measurable functions $X : \Omega \mapsto \mathbb{R}$. Further, assuming that risk
measure $\rho$ is lower semi-continuous (l.s.c.), the risk $\mathcal{R}(S; X_G)$ of subgraph
$G[S]$ with uncertain vertex weights $X_i$ can be defined as the optimal value of the following
stochastic programming problem:

$$\mathcal{R}(S; X_G) = \min \Big\{ \rho\Big(\sum_{i \in S} u_i X_i\Big) \;:\; \sum_{i \in S} u_i = 1,\ \ u_i \ge 0,\ i \in S \Big\}. \tag{5}$$


Notice that this definition of the subgraph risk function $\mathcal{R}(\cdot)$ admits risk reduction through diversification, as illustrated by the following proposition:

Proposition 1 ([18]) Given a graph $G = (V,E)$ with stochastic weights $X_i$, $i \in V$, and an l.s.c. risk measure $\rho$, the subgraph risk function $\mathcal{R}$ defined by (5) satisfies

$$\mathcal{R}(S_2;\mathbf{X}_G) \le \mathcal{R}(S_1;\mathbf{X}_G) \quad \text{for all } S_1 \subseteq S_2. \qquad (6)$$

The following observation regarding the optimal solution of the risk-averse maximum weighted $\mathcal{Q}$ problem (4) stems directly from property (6):

Corollary 1 There exists an optimal solution of the risk-averse maximum weighted $\mathcal{Q}$ problem (4) with $\mathcal{R}(S;\mathbf{X}_G)$ defined by (5) that is a maximal $\mathcal{Q}$-subgraph in $G$.
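The diversification property (6) can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the risk measure is taken to be the worst-case loss over equiprobable scenarios (one simple coherent, l.s.c. choice), and the minimization in (5) is approximated by a grid search over the simplex rather than solved exactly. The names `worst_case`, `subgraph_risk`, and `steps` are hypothetical.

```python
from itertools import product


def worst_case(losses):
    """Illustrative coherent risk measure: the largest loss over all scenarios."""
    return max(losses)


def subgraph_risk(S, scenarios, rho=worst_case, steps=20):
    """Approximate R(S; X_G) from (5) by a grid search over the simplex.

    S         -- list of vertex indices
    scenarios -- list of scenarios, each a list of vertex weights
    steps     -- grid resolution: every u_i is a multiple of 1/steps
    """
    S = list(S)
    best = float("inf")
    # Enumerate integers c_i >= 0 with sum(c_i) == steps, so that u_i = c_i / steps.
    for c in product(range(steps + 1), repeat=len(S) - 1):
        last = steps - sum(c)
        if last < 0:
            continue
        u = [ci / steps for ci in c] + [last / steps]
        losses = [sum(ui * x[i] for ui, i in zip(u, S)) for x in scenarios]
        best = min(best, rho(losses))
    return best


# Two scenarios, three vertices; vertices 0 and 1 hedge each other.
scenarios = [[1.0, 0.0, 0.6],
             [0.0, 1.0, 0.6]]
r_single = subgraph_risk([0], scenarios)
r_pair = subgraph_risk([0, 1], scenarios)
r_all = subgraph_risk([0, 1, 2], scenarios)
```

Because the grid for a subset is embedded in the grid for any superset (set the extra weights to zero), the computed values are non-increasing as vertices are added, mirroring (6).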

Additional properties of $\mathcal{R}(S;\mathbf{X}_G)$ ensue from the assumption that the risk measure $\rho$ belongs to the family of coherent measures of risk. Namely, the definition of $\rho$ is augmented with the properties of monotonicity, subadditivity, translation invariance, and positive homogeneity (see [19]). Assuming that the risk measure $\rho$ in (5) is coherent, or satisfies the first three properties and is l.s.c., the corresponding subgraph risk function $\mathcal{R}(S;\mathbf{X}_G)$ satisfies analogous properties with respect to the stochastic weights vector $\mathbf{X}_G$:

(G1) monotonicity: $\mathcal{R}(S;\mathbf{X}_G) \le \mathcal{R}(S;\mathbf{Y}_G)$ for all $\mathbf{X}_G \le \mathbf{Y}_G$;

(G2) positive homogeneity: $\mathcal{R}(S;\lambda\mathbf{X}_G) = \lambda\mathcal{R}(S;\mathbf{X}_G)$ for all $\mathbf{X}_G$ and $\lambda > 0$;

(G3) translation invariance: $\mathcal{R}(S;\mathbf{X}_G + a\mathbf{1}) = \mathcal{R}(S;\mathbf{X}_G) + a$ for all $a \in \mathbb{R}$;

where $\mathbf{1}$ is the vector of ones, and the vector inequality $\mathbf{X}_G \le \mathbf{Y}_G$ is interpreted component-wise.

Observe that $\mathcal{R}(S;\mathbf{X}_G)$ violates the subadditivity requirement with respect to the stochastic weights. However, risk reduction via diversification is guaranteed by (6), which ensures that the inclusion of additional vertices in an existing feasible solution never increases the risk. Further, under the assumption of non-negative stochastic vertex weights, $\mathbf{X}_G \ge 0$, the subgraph risk $\mathcal{R}(S;\mathbf{X}_G)$ can be shown to be subadditive with respect to induced subgraphs in $G$:

$$\mathcal{R}(S_1 \cup S_2;\mathbf{X}_G) \le \mathcal{R}(S_1;\mathbf{X}_G) + \mathcal{R}(S_2;\mathbf{X}_G), \quad S_1, S_2 \subseteq V. \qquad (7)$$

Clearly, it is required that $S_1$, $S_2$, and $S_1 \cup S_2$ satisfy property $\mathcal{Q}$, in conformance with the context of risk-averse maximum weighted $\mathcal{Q}$ problems.


3.  Solution  approaches  for  risk-averse  maximum  weighted  2-club  problems 

In this section we consider a mathematical programming formulation for the R-MWK problem with $k = 2$, where the risk $\mathcal{R}(S)$ of the induced subgraph $G[S]$ is defined by (5). We also propose a combinatorial branch-and-bound algorithm that utilizes the solution space processing principles for finding maximum $k$-clubs introduced by Pajouh and Balasundaram [22].


3.1.  A  mathematical  programming  formulation 


Let binary decision variables $x_i$ indicate whether node $i \in V$ belongs to a subset $S$:

$$x_i = \begin{cases} 1, & i \in S \text{ such that } G[S] \text{ satisfies } \mathcal{Q}, \\ 0, & \text{otherwise.} \end{cases}$$

When the property $\mathcal{Q}$ denotes a 2-club, one can choose the edge formulation of the maximum 2-club problem proposed by Balasundaram et al. [21], whereby the mathematical programming formulation of the R-MWK problem with $k = 2$ takes the form

$$\begin{aligned}
\min\ & \rho\Big(\sum_{i \in V} u_i X_i\Big) \\
\text{s.t.}\ & \sum_{i \in V} u_i = 1, \\
& u_i \le x_i, \quad i \in V, \\
& x_i + x_j - \sum_{l \in \mathcal{N}(i,j)} x_l \le 1, \quad (i,j) \in \overline{E}, \\
& x_i \in \{0,1\},\ u_i \ge 0, \quad i \in V,
\end{aligned} \qquad (8)$$

where $\overline{E}$ represents the complement edges of graph $G$, and $\mathcal{N}(i,j)$ denotes the set of vertices that are adjacent to both vertex $i$ and vertex $j$. Appropriate (nonlinear) mixed integer programming solvers can be used to solve formulation (8) with risk measures $\rho$ whose representations admit some form of mathematical programming problem. A combinatorial branch-and-bound algorithm for solving R-MWK problems is described next.

3.2.  A  combinatorial  branch-and-bound  algorithm 

The following branch-and-bound (BnB) algorithm for solving R-MWK problems entails efficient processing of the solution space by traversing "levels" of the BnB tree until a subgraph $G[S]$ that represents a maximal 2-club of minimum risk in $G$, as measured by (5), is found. The algorithm begins at level $\ell = 0$ with a partial solution $Q := \emptyset$, incumbent solution $Q^* := \emptyset$, and an upper bound on risk $L^* := +\infty$ (the risk induced by $Q^*$), where $Q$ consists of the vertices of the induced subgraph with property $\mathcal{Q}$, and $Q^*$ contains vertices corresponding to a maximal $\mathcal{Q}$-subgraph whose risk equals $L^*$ in $G$. A set of "candidate" vertices $C_\ell$ is maintained at each level $\ell$, from which a certain branching vertex $q$ is selected and either added to the partial solution $Q$ or simply deleted from $C_\ell$ without being added to $Q$. In order to ensure that the proper vertices are removed from $Q$ when the algorithm backtracks between levels of the BnB tree, we introduce the set $F := \emptyset$ to account for the levels at which nodes were created to delete a vertex $q$ from $C_\ell$.

Due to the distance-based properties of $k$-clubs, care is warranted when transferring or deleting a vertex $q$ from the candidate set $C_\ell$, as the structural integrity of the graph induced by $Q$ and the candidate set at the subsequent level, $C_{\ell+1}$, may be affected. Thus, the removal of $q$ from $C_\ell$ in order to add it to $Q$, and the deletion of $q$ from $C_\ell$ without adding it to $Q$, are considered independently via the construction of two BnB tree nodes for any given current node at level $\ell$. The first node is created to include $q$ in $Q$, while the other is created to delete $q$ from $C_\ell$. The necessary structural properties of $Q$ and $C_{\ell+1}$ at each node are described next.

Consider a $k$-clique in graph $G$ as a subset $S$ that satisfies

$$\{ S \subseteq V \mid \forall\, i,j \in S : d_G(i,j) \le k \},$$

and observe that any $k$-club in $G$ also satisfies the properties of a $k$-clique, while a $k$-clique is not necessarily a $k$-club for $k \ge 2$. Further, both reduce to a complete subgraph in the case of $k = 1$. By this notion, a partial solution $Q$ defines a $k$-club if the following conditions are maintained for all graphs $G[Q \cup C_{\ell+1}]$:

(C1) $Q$ is a $k$-clique in $G[Q \cup C_{\ell+1}]$;

(C2) $d_{G[Q \cup C_{\ell+1}]}(i,j) \le k$, $\forall\, i \in Q$, $\forall\, j \in C_{\ell+1}$.
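The distinction between a $k$-clique (distances measured in the ambient graph) and a $k$-club (distances measured inside the induced subgraph) can be made concrete with breadth-first search. The following is a minimal sketch, assuming an adjacency-list dictionary; the function names are hypothetical and not taken from the original implementation.

```python
from collections import deque


def distances_from(adj, source, allowed):
    """BFS hop distances from `source`, walking only through vertices in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in allowed and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def is_k_clique(adj, S, k, within=None):
    """True if all pairs in S are at distance <= k in the graph induced by `within`
    (the whole graph when `within` is None)."""
    allowed = set(adj) if within is None else set(within)
    S = set(S)
    for i in S:
        dist = distances_from(adj, i, allowed)
        if any(dist.get(j, float("inf")) > k for j in S):
            return False
    return True


def is_k_club(adj, S, k):
    """True if the subgraph induced by S itself has diameter at most k."""
    return is_k_clique(adj, S, k, within=S)


# Star with center 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
leaves_are_2_clique = is_k_clique(star, {1, 2, 3}, 2)   # paths through the center count
leaves_are_2_club = is_k_club(star, {1, 2, 3}, 2)       # center is not in the subgraph
star_is_2_club = is_k_club(star, {0, 1, 2, 3}, 2)
```

Condition (C1) corresponds to calling `is_k_clique` with `within` set to $Q \cup C_{\ell+1}$.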

The algorithm is then initialized with $C_0 := V$. Whenever a vertex $q$ is selected from $C_\ell$ and added to $Q$, the candidate set at level $\ell + 1$ must be constructed accordingly by removing from $C_\ell$ all vertices whose distances to vertex $q$ are larger than $k$:

$$C_{\ell+1} := \{ j \in C_\ell : d_{G[Q \cup C_\ell]}(q,j) \le k \}.$$


In situations when the deleted vertices serve as intermediaries, their removal from $C_\ell$ may, however, impose pairwise distance violations among the vertices in $Q \cup q$ with respect to condition (C2). In other words, after removing vertex $q$ from $C_\ell$, the distance between a pair of vertices $i, j \in Q$ may satisfy $d_{G[Q \cup C_{\ell+1}]}(i,j) > k$. In such cases, the corresponding node of the BnB tree is fathomed and the algorithm backtracks to level $\ell$. If a BnB tree node is created to delete vertex $q$, the candidate set $C_{\ell+1}$ is likewise constructed by eliminating vertices that violate (C2). If the removal of vertices from the candidate sets in either of the above cases results in a violation of (C1), then the corresponding BnB node is fathomed.
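The candidate-set update for an "include" node, and the fathoming test that follows it, can be sketched as below. This is an illustrative reading of the prose description (candidates are filtered by their distance to the newly added vertex $q$ only), not the authors' C++ code; `hop_distances` and `refine_candidates` are hypothetical names.

```python
from collections import deque


def hop_distances(adj, source, allowed):
    """BFS hop distances from `source`, walking only through vertices in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in allowed and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def refine_candidates(adj, Q, C, q, k):
    """Move q from the candidate set C into Q and build the next candidate set.

    Returns (Q_new, C_next, feasible); feasible is False when Q_new is no longer
    a k-clique in G[Q_new ∪ C_next], i.e. the node should be fathomed.
    """
    Q_new = set(Q) | {q}
    rest = set(C) - {q}
    universe = Q_new | rest
    dist_q = hop_distances(adj, q, universe)
    # Keep only candidates within k hops of q.
    C_next = {j for j in rest if dist_q.get(j, float("inf")) <= k}
    induced = Q_new | C_next
    feasible = all(
        hop_distances(adj, i, induced).get(j, float("inf")) <= k
        for i in Q_new for j in Q_new
    )
    return Q_new, C_next, feasible


# 6-cycle 0-1-2-3-4-5-0 with Q = {0, 2}, whose only short connection is vertex 1.
cycle6 = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
bad = refine_candidates(cycle6, {0, 2}, {1, 3, 4, 5}, 4, 2)
good = refine_candidates(cycle6, {0, 2}, {1, 3, 4, 5}, 1, 2)
```

Adding vertex 4 drops the intermediary vertex 1 from the candidate set, so vertices 0 and 2 end up four hops apart and the node is fathomed; adding vertex 1 keeps the partial solution feasible.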

The subsequent step entails evaluating the quality of the solution that can be obtained from the subgraph induced by the vertices in $Q \cup C_{\ell+1}$. An exact approach of directly finding the 2-club with the lowest possible risk that is contained in $G[Q \cup C_{\ell+1}]$ would involve solving problem (8) with $x_i = 0$, $i \in V \setminus (Q \cup C_{\ell+1})$. However, solving a mixed 0-1 problem at every node of the BnB tree is impractical, so a lower bound problem is obtained by eliminating the variables $x_i$, $i \in V$, and the graph structural constraints:

$$\begin{aligned}
\mathcal{R}(Q \cup C_{\ell+1};\mathbf{X}_G) \ge \mathcal{L}(Q \cup C_{\ell+1}) := \min\ & \rho\Big(\sum_{i \in V} u_i X_i\Big) \\
\text{s.t.}\ & \sum_{i \in V} u_i = 1, \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}), \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1}.
\end{aligned} \qquad (9)$$

This notion admits the assumption that $G[Q \cup C_{\ell+1}]$ is a 2-club, under which all the mentioned graph structural constraints would be satisfied and thus vanish. Therefore, by virtue of Proposition 1, the solution of (9) provides a lower bound on the risk achievable by any 2-club contained in the graph induced by the union of the vertices in $Q$ and any subset of the vertices in $C_{\ell+1}$. As a result, the risk at any subsequent level $\ell'$ along the current branch of the BnB tree cannot fall below this bound as the set $Q \cup C_{\ell'+1}$ is refined.

The computed values of $\mathcal{L}(Q \cup C_{\ell+1})$ determine whether the algorithm branches further or prunes/backtracks. If $\mathcal{L}(Q \cup C_{\ell+1}) \ge L^*$, then the corresponding branch of the BnB tree is fathomed, since sequential refinement cannot achieve a further reduction in risk. If $C_\ell \ne \emptyset$, another branching vertex is selected and either removed from $C_\ell$ and added to $Q$, or deleted from $C_\ell$. Alternatively, if $C_\ell = \emptyset$, the algorithm backtracks to level $\ell - 1$.

In the case when $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ and $C_{\ell+1} \ne \emptyset$, a branching vertex $q$ is selected at the next level $\ell + 1$. In the case of $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ and $C_{\ell+1} = \emptyset$, $G[Q]$ represents a maximal 2-club in $G$ and is assigned as the new incumbent solution, $Q^* := Q$, and the global upper bound on risk is updated, $L^* := \mathcal{L}(Q \cup C_{\ell+1})$. The algorithm then backtracks to level $\ell - 1$.

Empirical observations suggest that branching on a vertex $q$ with the smallest value of $\rho(X_q)$ or $\mathsf{E}X_q$ can significantly enhance computational performance. To this end, the vertices in any candidate set $C_\ell$ are arranged in descending order with respect to their risks $\rho(X_i)$ or expected values $\mathsf{E}X_i$, and the last vertex in $C_\ell$ is always selected for branching.
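The ordering rule above amounts to a sort followed by taking the last element. A minimal sketch, assuming equiprobable scenarios and using the expected weight as the score (a risk value $\rho(X_i)$ could be substituted); the helper names are hypothetical.

```python
def expected_weight(i, scenarios):
    """Sample mean of vertex i's weight over equiprobable scenarios."""
    return sum(x[i] for x in scenarios) / len(scenarios)


def order_candidates(candidates, scenarios, score=expected_weight):
    """Candidates sorted in descending order of score, so the best vertex is last."""
    return sorted(candidates, key=lambda i: score(i, scenarios), reverse=True)


def pop_branching_vertex(ordered):
    """Remove and return the last vertex, i.e. the one with the smallest score."""
    return ordered.pop()


scenarios = [[0.9, 0.1, 0.5],
             [0.7, 0.3, 0.5]]
ordered = order_candidates([0, 1, 2], scenarios)
q = pop_branching_vertex(ordered)
```

Keeping the list sorted in descending order makes the selection of the branching vertex a constant-time `pop` from the end.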

The described branch-and-bound procedure for R-MWK problems is formalized in Algorithm 1. Notice that it is applicable to any positive integer value of $k$.

Algorithm 1 Graph-based branch-and-bound method for problem (8)

 1. Initialize: $\ell := 0$; $C_0 := V$; $Q := \emptyset$; $Q^* := \emptyset$; $L^* := \infty$; $F := \emptyset$;
 2. while (not STOP) do
 3.   if $C_\ell \ne \emptyset$ then
 4.     select a vertex $q \in C_\ell$;
 5.     $C_\ell := C_\ell \setminus q$;
 6.     $Q := Q \cup q$;
 7.     $C_{\ell+1} := \{ j \in C_\ell : d_{G[Q \cup C_\ell]}(i,j) \le k\ \forall\, i \in Q \}$;
 8.     if $Q$ is a $k$-clique in $G[Q \cup C_{\ell+1}]$ then
 9.       solve $\mathcal{L}(Q \cup C_{\ell+1})$;
10.       if $\mathcal{L}(Q \cup C_{\ell+1}) < L^*$ then
11.         if $C_{\ell+1} \ne \emptyset$ then
12.           $\ell := \ell + 1$;
13.         else
14.           $Q^* := Q$;
15.           $L^* := \mathcal{L}(Q \cup C_{\ell+1})$;
16.           $Q := Q \setminus q$;
17.           if $\ell \notin F$ then
18.             $Q := Q \setminus q$;
19.             $C_{\ell+1} := \{ j \in C_\ell : d_{G[Q \cup C_\ell]}(i,j) \le k\ \forall\, i \in Q \}$;
20.             if $C_{\ell+1} \ne \emptyset$ then
21.               if $Q$ is a $k$-clique in $G[Q \cup C_{\ell+1}]$ then
22.                 $F := F \cup \ell$;
23.                 go to step 9;
24.               else
25.                 go to step 3;
26.           else
27.             $F := F \setminus \ell$;
28.       else
29.         if $\ell \notin F$ then
30.           $Q := Q \setminus q$;
31.         else
32.           $F := F \setminus \ell$;
33.     else
34.       $Q := Q \setminus q$;
35.       $C_{\ell+1} := \{ j \in C_\ell : d_{G[Q \cup C_\ell]}(i,j) \le k\ \forall\, i \in Q \}$;
36.       if $Q$ is a $k$-clique in $G[Q \cup C_{\ell+1}]$ then
37.         $F := F \cup \ell$;
38.         go to step 9;
39.       else
40.         go to step 3;
41.   else
42.     $\ell := \ell - 1$;
43.     if $\ell = -1$ then
44.       STOP
45.     if $\ell \notin F$ then
46.       $Q := Q \setminus q$;
47.     else
48.       $F := F \setminus \ell$;
49. return $Q^*$


4.  Case  study:  Risk-averse  maximum  weighted  2-club  problem  with  higher 
moment  coherent  risk  measures 


In this section we present a computational framework for problem (8) and conduct numerical experiments demonstrating the computational performance enhancements associated with the proposed BnB algorithm. We adopt the class of higher-moment coherent risk (HMCR) measures, which were introduced in [23] as optimal values of the following stochastic programming problem:

$$\mathrm{HMCR}_{\alpha,p}(X) = \min_{\eta \in \mathbb{R}}\ \eta + (1-\alpha)^{-1} \|(X - \eta)^+\|_p, \quad \alpha \in (0,1),\ p \ge 1, \qquad (10)$$

where $X^+ = \max\{0, X\}$ and $\|X\|_p = (\mathsf{E}|X|^p)^{1/p}$. Mathematical programming problems that contain HMCR measures can be formulated using $p$-order cone constraints. Typically, in stochastic programming models, the set of random events $\Omega$ is assumed to be discrete, $\Omega = \{\omega_1, \ldots, \omega_N\}$, with probabilities $\mathsf{P}(\omega_k) = \pi_k > 0$ and $\pi_1 + \cdots + \pi_N = 1$.
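For a discrete scenario set, definition (10) is a one-dimensional convex minimization in $\eta$ and can be evaluated directly. The sketch below is an illustration only: it uses a ternary search rather than the cone programming machinery discussed in the text, and the function name `hmcr` and its parameters are assumptions. The search bracket relies on two observations: the objective equals $\eta$ for $\eta \ge \max_k x_k$, and it exceeds $\max_k x_k$ whenever $\eta < \min_k x_k - \frac{1-\alpha}{\alpha}(\max_k x_k - \min_k x_k)$.

```python
def hmcr(losses, alpha, p, probs=None, iters=200):
    """Evaluate HMCR_{alpha,p} of a discrete loss distribution via ternary search.

    losses -- scenario realizations x_1, ..., x_N
    probs  -- scenario probabilities (equiprobable when omitted)
    """
    n = len(losses)
    probs = probs or [1.0 / n] * n

    def objective(eta):
        # eta + (1 - alpha)^{-1} * ( sum_k pi_k * ((x_k - eta)^+)^p )^{1/p}
        tail = sum(pk * max(0.0, x - eta) ** p for pk, x in zip(probs, losses))
        return eta + tail ** (1.0 / p) / (1.0 - alpha)

    lo_x, hi_x = min(losses), max(losses)
    lo = lo_x - (1.0 - alpha) / alpha * (hi_x - lo_x)  # no minimizer lies below this
    hi = hi_x                                          # objective equals eta beyond max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


losses = [float(v) for v in range(1, 11)]   # ten equiprobable scenarios
cvar_90 = hmcr(losses, 0.9, 1)              # p = 1 gives Conditional Value-at-Risk
hmcr_2 = hmcr(losses, 0.9, 2)
```

With $p = 1$ and ten equiprobable scenarios, the 10% tail consists of the single largest loss, so the value coincides with the maximum; increasing $p$ can only increase the measure because the $L_p$ norm is non-decreasing in $p$.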

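The corresponding mathematical programming model (8) with $\rho(X) = \mathrm{HMCR}_{\alpha,p}(X)$ takes the following mixed integer $p$-order cone programming form:

$$\begin{aligned}
\min\ & \eta + (1-\alpha)^{-1} t \\
\text{s.t.}\ & t \ge \|(y_1, \ldots, y_N)\|_p, \\
& \pi_k^{-1/p} y_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \ldots, N, \\
& \sum_{i \in V} u_i = 1, \\
& u_i \le x_i, \quad i \in V, \\
& x_i + x_j - \sum_{l \in \mathcal{N}(i,j)} x_l \le 1, \quad (i,j) \in \overline{E}, \\
& x_i \in \{0,1\},\ u_i \ge 0,\ i \in V; \quad y_k \ge 0,\ k = 1, \ldots, N,
\end{aligned} \qquad (11)$$

where $X_{ik}$ represents the realization of the stochastic weight of vertex $i \in V$ under scenario $k$, $k = 1, \ldots, N$. Analogously, the lower bound problem (9) takes the form

$$\begin{aligned}
\mathcal{L}(Q \cup C_{\ell+1}) = \min\ & \eta + (1-\alpha)^{-1} t \\
\text{s.t.}\ & t \ge \|(y_1, \ldots, y_N)\|_p, \\
& \pi_k^{-1/p} y_k \ge \sum_{i \in V} u_i X_{ik} - \eta, \quad k = 1, \ldots, N, \\
& \sum_{i \in V} u_i = 1, \\
& u_i \ge 0, \quad i \in Q \cup C_{\ell+1}, \\
& u_i = 0, \quad i \in V \setminus (Q \cup C_{\ell+1}), \\
& y_k \ge 0, \quad k = 1, \ldots, N.
\end{aligned} \qquad (12)$$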

For instances when $p = 1$ or $2$, problems (11) and (12) reduce to linear programming (LP) and second order cone programming (SOCP) models, respectively. However, in cases when $p \in (1,2) \cup (2,\infty)$ the $p$-cone is not self-dual and there exist no efficient long-step self-dual interior point solution methods. Consequently, we employ the methods for representing $p$-order cones in a higher dimensional space [27] that are based on polyhedral approximations of $p$-order cones and representation of rational-order $p$-cones via second order cones.

4.1.  Setup  of  the  numerical  experiments  and  results 

Numerical experiments on the risk-averse maximum weighted 2-club problem were conducted on randomly generated Erdős-Rényi graphs of orders $|V| = 25, 50, 100$ with average densities $d = 0.0125, 0.025, 0.05, 0.1, 0.15$. The specified edge probabilities were chosen due to empirical observations indicating that a graph of order $|V| \ge 50$ commonly reduces to a 2-club when the density is in the range $[0.15, 0.25]$. The stochastic weights of the graphs' vertices were generated as i.i.d. samples from the uniform $U(0,1)$ distribution. Scenario sets with $N = 100$ were generated for each combination of graph order and density. The HMCR risk measure (10) with $p = 1, 2, 3$ and $\alpha = 0.9$ was used.
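A test instance of the kind described above can be generated along the following lines. This is a hedged sketch rather than the generator used in the study: the function names and the seed are hypothetical, and the edge probability is taken equal to the target density $d$.

```python
import random


def erdos_renyi(n, d, rng):
    """Erdős-Rényi graph G(n, d): each of the n(n-1)/2 edges appears with probability d."""
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < d:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def uniform_scenarios(n, num_scenarios, rng):
    """Scenario matrix: num_scenarios rows of i.i.d. U(0,1) vertex weights."""
    return [[rng.random() for _ in range(n)] for _ in range(num_scenarios)]


rng = random.Random(0)                      # fixed seed for reproducibility (arbitrary)
graph = erdos_renyi(25, 0.1, rng)           # |V| = 25, d = 0.1
scenarios = uniform_scenarios(25, 100, rng) # N = 100 scenarios
```

Each row of `scenarios` plays the role of one realization $(X_{1k}, \ldots, X_{|V|k})$ in problems (11) and (12).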

The BnB algorithm was coded in C++, and we used the CPLEX Simplex and Barrier solvers for the polyhedral approximations and SOCP reformulations of the $p$-order cone programming lower bound problem (12), respectively (see [27]). For instances when $p = 1$, the CPLEX Simplex solver was utilized to solve problem (12) directly. The computations were conducted on an Intel Xeon 3.30GHz PC with 128GB RAM, using the CPLEX 12.5 solver in a Windows 7 64-bit environment.

The computational performance of the mathematical programming model (11) was compared with that of the developed BnB algorithm. In the case of $p = 1$, problem (11) was solved with the CPLEX MIP solver. The CPLEX MIP Barrier solver was used for the SOCP version in the case of $p = 2$, and for the SOCP reformulation in the case of $p = 3$.

Table 1 presents the computational times, averaged over five instances. Observe that the BnB algorithm outperforms the CPLEX MIP solver over all the listed graph configurations, and performance improvements of one to two orders of magnitude were witnessed for the majority of instances. Further, the relative differences in performance become more pronounced with an increase in $p$. Also noteworthy is the improvement in relative performance of the BnB method for problems with $p = 3$ in comparison to $p = 2$. This results from the properties of the cutting-plane algorithm for solving polyhedral approximations of $p$-order cone programming problems, which becomes more effective as $p$ increases [27].


               d = 0.0125       d = 0.025        d = 0.05         d = 0.1          d = 0.15
p   |V|     CPLEX    BnB     CPLEX    BnB     CPLEX    BnB     CPLEX    BnB     CPLEX    BnB
1    25      0.47    0.06     0.54    0.04     0.46    0.04     0.31    0.04     0.32    0.08
     50      1.32    0.13     0.74    0.14     0.79    0.18     1.29    0.33     2.47    1.91
    100      1.99    0.07     3.25    0.38     6.00    2.19    57.62   40.90       -       -
2    25     11.00    0.56     9.63    0.72     6.24    0.33     6.38    0.37    10.57    0.43
     50     16.20    0.69    14.89    0.52    19.01    0.46    46.19    1.10   167.51    4.91
    100     38.25    0.61   119.15    1.15   253.27    2.91   973.18   70.45       -       -
3    25     40.48    0.90    25.65    0.81    15.53    0.42    15.26    0.66    27.25    0.86
     50     35.89    1.11    31.80    1.21    42.39    1.09    90.74    1.55   232.49    5.36
    100     70.47    1.08   188.71    1.54   316.38    3.13  1455.73   62.73       -       -

Table 1. Average computation times (in seconds) obtained by solving problem (8) using the proposed BnB algorithm and CPLEX with risk measure (10) and $N = 100$ scenarios. All running times are averaged over 5 instances, and the symbol "-" indicates that the time limit of 7200 seconds was exceeded.


5.  Conclusions 

We have considered R-MWK problems, which entail finding a $k$-club of minimum risk in a graph. HMCR risk measures were utilized for quantifying the distributional information of the stochastic factors associated with vertex weights. It was shown that there exist optimal solutions of R-MWK problems that are maximal $k$-clubs. A combinatorial BnB solution algorithm was developed and tested on the special case of the R-MWK problem with $k = 2$. Numerical experiments on randomly generated graphs of various configurations suggest that the proposed BnB algorithm significantly reduces solution times in comparison with the mathematical programming model solved using the CPLEX MIP solver.

On p-norm linear discrimination

Yana Morenko*  Alexander Vinel*  Zhaohan Yu*  Pavlo Krokhmal*


Abstract 

We consider a $p$-norm linear discrimination model that generalizes the model of Bennett and Mangasarian (1992) and reduces to a linear programming problem with $p$-order cone constraints. The proposed approach for handling linear programming problems with $p$-order cone constraints is based on reformulation of $p$-order cone optimization problems as second order cone programming (SOCP) problems when $p$ is rational. Since such reformulations typically lead to SOCP problems with large numbers of second order cones, an "economical" representation that minimizes the number of second order cones is proposed. A case study illustrating the developed model on several popular data sets is conducted.


1  Introduction 

Consider two discrete sets $\mathcal{A}, \mathcal{B} \subset \mathbb{R}^n$ containing $k$ and $m$ points, respectively: $\mathcal{A} = \{a_1, \ldots, a_k\}$, $\mathcal{B} = \{b_1, \ldots, b_m\}$. One of the principal tasks arising in machine learning and data mining is that of discrimination of such sets, namely, constructing a surface $f(x) = 0$ such that $f(x) < 0$ for any $x \in \mathcal{A}$ and $f(x) > 0$ for all $x \in \mathcal{B}$. Of particular interest is the linear separating surface (hyperplane): $f(x) = w^\top x - \gamma = 0$. From the simple fact that any two points $y_1, y_2 \in \mathbb{R}^n$ satisfying the inequalities $w^\top y_1 - \gamma > 0$, $w^\top y_2 - \gamma < 0$ for some $w$ and $\gamma$ are located on the opposite sides of the hyperplane $w^\top x - \gamma = 0$, it follows that the discrete sets $\mathcal{A}, \mathcal{B} \subset \mathbb{R}^n$ are considered linearly separable if and only if there exists $w \in \mathbb{R}^n$ such that $w^\top a_i > \gamma > w^\top b_j$ for all $i = 1, \ldots, k$, $j = 1, \ldots, m$, with an appropriately chosen $\gamma$, or, equivalently,

$$\min_{a_i \in \mathcal{A}} a_i^\top w > \max_{b_j \in \mathcal{B}} b_j^\top w. \qquad (1)$$

Clearly, existence of such a separating hyperplane is not guaranteed (namely, a separating hyperplane exists if the convex hulls of sets $\mathcal{A}$ and $\mathcal{B}$ are disjoint); thus, in general, a separating hyperplane that minimizes some sort of misclassification error is desired.

In the next section we introduce a new linear separation model that is based on $p$-order cone programming, and discuss its key properties. The proposed solution approach, based on a reformulation of $p$-cone programming problems as second order cone programming (SOCP) problems when $p$ is rational, is presented in Section 3. Section 4 contains a case study on several popular data sets that illustrates the developed discrimination model.


2  p-Norm linear separation: A stochastic optimization analogy

Since definition (1) involves strict inequalities, it is not well suited for mathematical programming models of selecting the "best" linear separator. However, the fact that the separating hyperplane can be scaled by any positive factor allows one to formulate the following observation:

Proposition 1 ([4]) Discrete sets $\mathcal{A}, \mathcal{B} \subset \mathbb{R}^n$ represented by matrices $A = (a_1, \ldots, a_k)^\top \in \mathbb{R}^{k \times n}$ and $B = (b_1, \ldots, b_m)^\top \in \mathbb{R}^{m \times n}$, respectively, are linearly separable if and only if

$$Aw \ge e\gamma + e, \quad Bw \le e\gamma - e \quad \text{for some } w \in \mathbb{R}^n,\ \gamma \in \mathbb{R}, \qquad (2)$$

where $e$ is the vector of ones of an appropriate dimension, $e = (1, \ldots, 1)^\top$.

Given the linear separability condition (2), the (non-negative) vectors $x_{\mathcal{A}} = (-Aw + e\gamma + e)^+$, $x_{\mathcal{B}} = (Bw - e\gamma + e)^+$, where $t^+ = \max\{0, t\}$, represent misclassification errors: $x_{\mathcal{A}}$ and/or $x_{\mathcal{B}}$ are nonzero if sets $\mathcal{A}$ and $\mathcal{B}$ are not linearly separable. If one considers that points of sets $\mathcal{A}$ and $\mathcal{B}$ represent realizations of (discretely distributed) random vectors $a, b \in \mathbb{R}^n$, respectively, the corresponding elements of vectors $x_{\mathcal{A}}$, $x_{\mathcal{B}}$ may be regarded as realizations of random variables $X_{\mathcal{A}}(a; w, \gamma) = (-a^\top w + \gamma + 1)^+$, $X_{\mathcal{B}}(b; w, \gamma) = (b^\top w - \gamma + 1)^+$, respectively, that depend parametrically on the decision variables $w$ and $\gamma$. Then, a plausible strategy for selecting $w$ and $\gamma$ is one that minimizes, for example, the expected misclassification errors, and which can be formulated as the following stochastic programming problem:

$$\min_{(w,\gamma) \in \mathbb{R}^{n+1}} \Big\{ \delta_1 \mathsf{E}\big[(-a^\top w + \gamma + 1)^+\big] + \delta_2 \mathsf{E}\big[(b^\top w - \gamma + 1)^+\big] \Big\},$$

where $\delta_{1,2}$ serve as "importance" weights of the misclassification errors for points of sets $\mathcal{A}$ and $\mathcal{B}$, respectively. Further, instead of minimizing the expected misclassification error, one may select the parameters $w$ and $\gamma$ so as to minimize the risk of misclassification. As is well known in stochastic optimization and risk analysis, the "risk" associated with the random outcome of a decision under uncertainty is often attributed to the "heavy" tails of the corresponding probability distribution. The risk-inducing "heavy" tails of probability distributions are, in turn, characterized by the distribution's higher moments. Thus, if the misclassifications introduced by a separating hyperplane can be viewed as "random", the misclassification risk may be controlled better if one minimizes not the average, or expected, misclassification errors, but their moments of order $p > 1$. This gives rise to the following formulation for linear discrimination of sets $\mathcal{A}$ and $\mathcal{B}$:

$$\min_{(w,\gamma) \in \mathbb{R}^{n+1}} \delta_1 \big\|(-a^\top w + \gamma + 1)^+\big\|_p + \delta_2 \big\|(b^\top w - \gamma + 1)^+\big\|_p, \quad p \in [1, +\infty], \qquad (3)$$

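where $\|\cdot\|_p$ is the usual $L_p$ norm: $\|X\|_p = (\mathsf{E}|X|^p)^{1/p}$ if $p \in [1, \infty)$, and $\|X\|_\infty = \operatorname{ess\,sup} |X|$. If $a$ and $b$ are uniformly distributed with support sets $\mathcal{A}$ and $\mathcal{B}$, respectively:

$$\mathsf{P}(a = a_i) = 1/k, \quad \mathsf{P}(b = b_j) = 1/m \quad \text{for all } a_i \in \mathcal{A},\ b_j \in \mathcal{B}, \qquad (4)$$

the $p$-norm linear discrimination problem takes the form

$$\min_{(w,\gamma) \in \mathbb{R}^{n+1}} \frac{\delta_1}{k^{1/p}} \big\|(-Aw + e\gamma + e)^+\big\|_p + \frac{\delta_2}{m^{1/p}} \big\|(Bw - e\gamma + e)^+\big\|_p, \qquad (5)$$

where $\|\cdot\|_p$ is a norm in Euclidean space of an appropriate dimension: $\|u\|_p = (|u_1|^p + \ldots + |u_l|^p)^{1/p}$, $p \in [1, \infty)$, and $\|u\|_\infty = \max_{i=1,\ldots,l} \{|u_i|\}$ (in the sequel, it shall be clear from the context whether the $L_p$ or Euclidean $p$-norm is used). Further, (5) can be formulated as a $p$-order cone programming problem (pOCP)

$$\begin{aligned}
\min\ & \delta_1 k^{-1/p}\, \xi + \delta_2 m^{-1/p}\, \eta & (6a) \\
\text{s.t.}\ & \xi \ge \|y\|_p, & (6b) \\
& \eta \ge \|z\|_p, & (6c) \\
& y \ge -Aw + e\gamma + e, & (6d) \\
& z \ge Bw - e\gamma + e, & (6e) \\
& z, y \ge 0. & (6f)
\end{aligned}$$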


Note that the special case of $p = 1$ and $\delta_1 = \delta_2$ corresponds to the linear discrimination model of Bennett and Mangasarian [4]. The $p$-cone programming linear separation model (3)-(6) shares many key properties with the LP separation model [4], including the guarantee that an optimal solution of (6) is non-zero in $w$ for linearly separable sets.
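The objective of formulation (5) is straightforward to evaluate for a given pair $(w, \gamma)$, which is useful for sanity-checking a candidate separator. The following sketch is illustrative only; the function name `pnorm_objective` and its argument layout (point sets as lists of coordinate lists) are assumptions.

```python
def pnorm_objective(A, B, w, gamma, p, delta1=1.0, delta2=1.0):
    """Objective (5): weighted p-norms of the misclassification errors of sets A and B."""
    def dot(x, y):
        return sum(xi * yi for xi, yi in zip(x, y))

    # Misclassification errors (-Aw + e*gamma + e)^+ and (Bw - e*gamma + e)^+.
    err_A = [max(0.0, -dot(a, w) + gamma + 1.0) for a in A]
    err_B = [max(0.0, dot(b, w) - gamma + 1.0) for b in B]

    def scaled_norm(errs):
        # n^{-1/p} * ||errs||_p equals the p-th root of the mean of errs^p.
        return (sum(e ** p for e in errs) / len(errs)) ** (1.0 / p)

    return delta1 * scaled_norm(err_A) + delta2 * scaled_norm(err_B)


# One-dimensional, linearly separable example.
A = [[2.0], [3.0]]
B = [[-2.0], [-3.0]]
separating = pnorm_objective(A, B, [1.0], 0.0, 3)              # w = 1, gamma = 0
null_plane = pnorm_objective(A, B, [0.0], 1.0, 3, delta1=0.5)  # w = 0, gamma = 1
```

For the separating choice the objective vanishes, in line with condition (2), while the null hyperplane $w = 0$, $\gamma = 1$ yields the value $2\delta_1$.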

Proposition 2 When sets $\mathcal{A}$ and $\mathcal{B}$, represented by matrices $A$ and $B$, are linearly separable, the separating hyperplane $w^{*\top} x = \gamma^*$ given by an optimal solution of (5)-(6) satisfies $w^* \ne 0$.

Proof: Zero optimal value of (6a) entails that $-Aw^* + e\gamma^* + e \le 0$, $Bw^* - e\gamma^* + e \le 0$ at optimality, which would require $\gamma^* \le -1$ and $\gamma^* \ge 1$ simultaneously for $w^* = 0$ to hold. ■

Secondly, the $p$-norm separation model (6) can produce a solution with $w = 0$ only in a rather special case that is identified by Theorem 1 below.

Theorem 1 Consider the $p$-order cone programming problem (5)-(6), where it is assumed without loss of generality that $0 < \delta_1 \le \delta_2$. Then, for any $p \in (1, \infty)$ the $p$-order cone programming problem (6) has an optimal solution with $w^* = 0$ if and only if

$$\frac{e^\top}{k} A = v^\top B, \quad \text{where } e^\top v = 1,\ v \ge 0,\ \|v\|_q \le \frac{\delta_2}{\delta_1 m^{1/p}}, \qquad (7a)$$

where $q$ satisfies $p^{-1} + q^{-1} = 1$. In other words, the arithmetic mean of the points in $\mathcal{A}$ must be equal to some convex combination of points in $\mathcal{B}$. In the case of $\delta_1 = \delta_2$ condition (7a) reduces to

$$\frac{e^\top}{k} A = \frac{e^\top}{m} B, \qquad (7b)$$

i.e., the arithmetic means of the points of sets $\mathcal{A}$ and $\mathcal{B}$ must coincide.
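Condition (7b) can be illustrated numerically on a small one-dimensional example in which the two sets share the same arithmetic mean. The sketch below evaluates objective (5) with $\delta_1 = \delta_2 = 1$ and $p = 2$ on a coarse grid of $(w, \gamma)$ values; the grid, the data, and the function name are assumptions made for illustration, and a grid search is of course not a proof.

```python
def objective(A, B, w, gamma, p):
    """Objective (5) for one-dimensional point sets with delta_1 = delta_2 = 1."""
    err_A = [max(0.0, -a * w + gamma + 1.0) for a in A]
    err_B = [max(0.0, b * w - gamma + 1.0) for b in B]

    def mean_p(errs):
        # n^{-1/p} * ||errs||_p
        return (sum(e ** p for e in errs) / len(errs)) ** (1.0 / p)

    return mean_p(err_A) + mean_p(err_B)


A = [-1.0, 1.0]    # arithmetic mean 0
B = [-2.0, 2.0]    # arithmetic mean 0, so (7b) holds
grid = [i / 10.0 for i in range(-30, 31)]

best = min(objective(A, B, w, g, 2) for w in grid for g in grid)
at_zero = objective(A, B, 0.0, 1.0, 2)   # the null hyperplane w = 0, gamma = 1
```

No grid point improves on the null hyperplane, whose objective value is $2\delta_1 = 2$, which is consistent with the statement of Theorem 1 for sets whose means coincide.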


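Proof: First, let us consider the case when the $p$-cone discrimination model (6) has an optimal solution with $w^* = 0$ and demonstrate that (7) must then hold. From the formulation (5) of problem (6) it follows that in the case when $w = 0$ at optimality, the corresponding optimal value of the objective (6a) is determined as

$$\min_{\gamma \in \mathbb{R}} \Big\{ \frac{\delta_1}{k^{1/p}} \big\|(e\gamma + e)^+\big\|_p + \frac{\delta_2}{m^{1/p}} \big\|(-e\gamma + e)^+\big\|_p \Big\} = \min_{\gamma \in \mathbb{R}} \big\{ \delta_1 (\gamma + 1)^+ + \delta_2 (1 - \gamma)^+ \big\} = 2\delta_1,$$

due to the assumption $0 < \delta_1 \le \delta_2$. Next, consider the dual of the $p$-cone programming problem (6):

$$\begin{aligned}
\max\ & e^\top u + e^\top v \\
\text{s.t.}\ & -A^\top u + B^\top v = 0, \\
& e^\top u - e^\top v = 0, \\
& 0 \le u \le -s, \\
& 0 \le v \le -t, \\
& \|s\|_q \le \delta_1 k^{-1/p}, \\
& \|t\|_q \le \delta_2 m^{-1/p},
\end{aligned} \qquad (8)$$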


where $q$ is such that $1/p + 1/q = 1$. Note that (6) is strictly feasible and bounded from below, since for any $w_0$, $\gamma_0$ and $\varepsilon > 0$ one can select $y_0 = \varepsilon e + (-Aw_0 + e\gamma_0 + e)^+ > 0$, $z_0 = \varepsilon e + (Bw_0 - e\gamma_0 + e)^+ > 0$, $\xi_0 = (1 + \varepsilon)\|y_0\|_p > \|y_0\|_p > 0$, and $\eta_0 = (1 + \varepsilon)\|z_0\|_p > \|z_0\|_p > 0$ that are feasible to (6). Thus, the duality gap for the primal-dual pair of $p$-order cone programming problems (6) and (8) is zero [12]. Then, from the first two constraints of (8) we have $A^\top u^* = B^\top v^*$ as well as $e^\top u^* = e^\top v^*$, which, given that the optimal objective value of (8) is $2\delta_1$, implies that an optimal $u^*$ must satisfy

$$e^\top u^* = \delta_1. \qquad (9a)$$

Also, from (8) it follows that

$$\|u^*\|_q \le \delta_1 k^{-1/p}. \qquad (9b)$$

Then, it is easy to see that the unique solution of system (9) is

$$u^* = \frac{\delta_1}{k}\, e,$$

which corresponds to the point where the surface $(u_1^q + \ldots + u_k^q)^{1/q} = \delta_1 k^{-1/p}$ is tangent to the hyperplane $u_1 + \ldots + u_k = \delta_1$ in the positive orthant of $\mathbb{R}^k$.

Likewise, an optimal $v^*$ must satisfy $e^\top v^* = \delta_1$ and $\|v^*\|_q \le \delta_2 m^{-1/p}$, but such $v^*$ is not unique in the case $\delta_2/\delta_1 > 1$. By substituting the obtained characterizations for $u^*$ and $v^*$ in the constraint $A^\top u^* = B^\top v^*$ of the dual, we obtain (7a). When $\delta_1 = \delta_2$, the optimal $v^*$ is unique: $v^* = \frac{\delta_1}{m}\, e$, and yields (7b).

To prove the statement of the Theorem in the opposite direction, assume that, for instance, (7a) holds for a certain $v$. Selecting $u^* = (\delta_1/k)e$, $v^* = \delta_1 v$, and $s^* = -u^*$, $t^* = -v^*$, it is easy to see that $(u^*, v^*, s^*, t^*)$ represents a feasible solution of the dual problem (8) with the dual cost of $2\delta_1$. Similarly, the tuple $(w^*, \gamma^*, y^*, z^*, \xi^*, \eta^*)$, where $w^* = 0$, $\gamma^* = 1$, $y^* = (e\gamma^* + e)^+ = 2e$, $z^* = (-e\gamma^* + e)^+ = 0$, $\xi^* = \|y^*\|_p = 2k^{1/p}$, $\eta^* = \|z^*\|_p = 0$, represents a feasible solution of the primal problem (6) with the corresponding objective value of $2\delta_1$. Noting the zero duality gap for the constructed pair of feasible solutions of (6) and (8), and recalling that the primal problem is bounded and strictly feasible, we immediately obtain that this pair of primal-dual solutions is optimal [12]. Hence, from (7a) it follows that an optimal solution of (6) exists with $w^* = 0$. ■


Observe that Theorem 1 implies that in the case of $\delta_1 = \delta_2$, the $p$-norm discrimination model (6) produces a null separating hyperplane only when the "geometric centers" of the sets $\mathcal{A}$ and $\mathcal{B}$ coincide. In practice, this means that such sets cannot be efficiently separated, at least by a hyperplane; thus an occurrence of a $w^* = 0$ solution in (6) may be regarded not as a shortfall of formulation (6), but rather as the general unsuitability of such sets $\mathcal{A}$ and $\mathcal{B}$ to linear discrimination. In the case of $\delta_1 < \delta_2$, occurrence of a $w^* = 0$ solution in (6) does not necessarily signify that sets $\mathcal{A}$ and $\mathcal{B}$ are hardly amenable to linear separation. In this case Theorem 1 only claims that the "geometric center" of $\mathcal{A}$ must lie within the convex hull of set $\mathcal{B}$, so that linear discrimination can still be a feasible approach, albeit at a cost of significant misclassification errors.

In order for a w* = 0 solution to occur only under the stricter condition (7b) when misclassification preferences for sets A and B are different, the p-norm linear discrimination model can be extended by applying norms of different orders to misclassifications of points in A and B:

$$\min_{(w,\gamma)\in\mathbb{R}^{n+1}} \; k^{-1/p_1}\,\big\|(-Aw + e\gamma + e)_+\big\|_{p_1} + m^{-1/p_2}\,\big\|(Bw - e\gamma + e)_+\big\|_{p_2}, \qquad p_1, p_2 \in (1,\infty). \tag{10}$$

Intuitively, a norm of higher order places more “weight” on the outliers. For example, use of the p = 1 norm entails minimization of the average of misclassifications; in contrast, application of the p = ∞ norm implies minimization of the largest misclassification for a set. Thus, by appropriately selecting the orders p1 and p2 in (10) one may introduce tolerance preferences on misclassifications of points of sets A and B. At the same time, it can be shown that the occurrence of a w* = 0 solution in (10) would signal the presence of the aforementioned singularity about the sets A and B. Namely, we have

Theorem 2 The p-order cone programming problem (10), where p1, p2 ∈ (1, ∞), has an optimal solution with w* = 0 if and only if (7b) holds.
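The remark that higher-order norms place more weight on outliers can be checked numerically. The following is a minimal sketch, assuming the k^(-1/p) scaling of the norm that appears in (10); the function name and the error vector are hypothetical and serve only as an illustration.

```python
def normalized_p_norm(x, p):
    """Return n**(-1/p) * ||x||_p for a nonempty list of nonnegative numbers."""
    n = len(x)
    if p == float("inf"):
        return max(x)                      # limit case: largest entry
    return (sum(v ** p for v in x) / n) ** (1.0 / p)

# Hypothetical misclassification vector: four small errors and one outlier.
errors = [0.1, 0.1, 0.1, 0.1, 2.0]

for p in (1, 2, 4, float("inf")):
    # p = 1 gives the average error; larger p moves the value toward the outlier.
    print(p, round(normalized_p_norm(errors, p), 4))
```

With this scaling, the p = 1 value is the mean of the entries and the p = ∞ value is their maximum, so the printed values increase with p.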

We conclude this section by pointing out a connection between the p-norm separation model and the classical Support Vector Machine (SVM) model. SVM models are widely used in classification problems (see some recent works in, e.g., [5, 9, 14]). The linear SVM for non-separable sets can be written as a quadratic programming problem of the form

$$\min \; \tfrac{1}{2}\|w\|^2 + C_1 e^\top \varepsilon_1 + C_2 e^\top \varepsilon_2 \tag{11a}$$
$$\text{s.t.} \quad Aw - e\gamma \ge e - \varepsilon_1 \tag{11b}$$
$$-Bw + e\gamma \ge e - \varepsilon_2 \tag{11c}$$
$$\varepsilon_1, \varepsilon_2 \ge 0 \tag{11d}$$

where ε1 and ε2 are misclassification vectors for sets A and B, respectively, and C1, C2 > 0.

Proposition 3 If the misclassification weight coefficients in the p-norm separation model (6) and the SVM model (11) coincide, C1 = δ1/k and C2 = δ2/m, the optimal value V*_SVM of SVM problem (11) can be bounded as

$$V_p^* \;\le\; V_{\mathrm{SVM}}^* \;\le\; V_p^* + \tfrac{1}{2}\|w^*\|^2,$$

where V*_p is the optimal value of p-norm problem (6) and w* is an optimal solution of (6).
Proof: By renaming variables ε1 = y, ε2 = z, problem (11) can be rewritten as

$$\min \Big\{ \tfrac{1}{2}\|w\|^2 + C_1 \xi + C_2 \eta \;\Big|\; \xi \ge \|y\|_1,\ \eta \ge \|z\|_1,\ \text{(6d)},\ \text{(6e)},\ \text{(6f)} \Big\}. \tag{12}$$

Setting C1 = δ1/k, C2 = δ2/m and taking into account that ||x||_p ≥ ||x||_q for 1 ≤ p ≤ q, it is easy to see that

$$\tfrac{1}{2}\|w^*\|^2 + C_1 \xi^* + C_2 \eta^* \;\ge\; \tfrac{1}{2}\|w^{**}\|^2 + C_1 \xi^{**} + C_2 \eta^{**} \;\ge\; C_1 \xi^{**} + C_2 \eta^{**} \;\ge\; C_1 \xi^*_{(1)} + C_2 \eta^*_{(1)} \;\ge\; C_1 \xi^* + C_2 \eta^*,$$

where w**, ξ**, η** are the optimal values of the variables in the SVM problem (12), w*, ξ*, η* are optimal solutions of the p-norm separation model (6), and ξ*_(1), η*_(1) are optimal solutions of (6) with p = 1. ■

In the next section we discuss the details of practical implementation of the p-norm linear discrimination model (6).


3 A second order cone programming approach to p-order cone programming problems

The p-order cone constraints (6b)-(6c) are central to practical implementation of the p-norm separation method (6). In the special cases of p = 1 or p = ∞, p-order cone constraints reduce to linear inequalities; specifically, the p = 1 version of model (6) has been studied in [4]. In general, the amenability of the 1-norm to implementation via linear constraints has been exploited in a variety of approaches and applications, too numerous to cite here. Another prominent special case is that of p = 2, when (6b)-(6c) represent second order, or quadratic, cones. Second order cone programming (SOCP) constitutes a well-developed subject of convex optimization, and a number of efficient self-dual “long-step” interior point (IP) SOCP algorithms have been developed in the literature and implemented in software [1, 2, 13]. The “general” case of p ∈ (1, 2) ∪ (2, ∞), when the p-cone is not self-dual, has received relatively limited attention in the literature. IP approaches to p-order cone programming have been considered in, e.g., [6, 11, 15]; a polyhedral approximation approach was proposed in [10].

In this work, we pursue an approach to solving p-cone programming problems that is based on the possibility to represent a p-order cone via a sequence of second order cones when p is rational [1, 12]. Reformulation of a rational-order p-cone programming problem as a SOCP problem allows for employing the efficient self-dual SOCP methods, albeit at a cost of a large number of second order cones required for such a reformulation. Moreover, since such a reformulation is not unique, in Section 3.2 we introduce a constructive “economical” representation of rational-order p-cones via second order cones.

3.1  Representation  of  rational-order  p-cones  with  second  order  cones 

Without loss of generality, consider a p-cone in the positive orthant of R^{n+1}:

$$t \ge (w_1^p + \ldots + w_n^p)^{1/p}, \qquad (t, w_1, \ldots, w_n)^\top \ge 0. \tag{13}$$

In the case when the parameter p is a positive rational number, p = r/s, where r, s ∈ N, the following “lifted” representation of the p-cone set (13) can, for instance, be constructed [1, 10]:

$$t \ge u_1 + \ldots + u_n, \qquad u_j \ge 0, \quad j = 1, \ldots, n, \tag{14a}$$
$$w_j^R \le u_j^s\, t^{\,r-s}\, w_j^{R-r}, \qquad j = 1, \ldots, n, \tag{14b}$$

where R = 2^ρ, ρ = ⌈log2 r⌉. Each nonlinear inequality (14b) can then equivalently be replaced by a sequence of three-dimensional (3D) rotated quadratic cones z^2 ≤ xy; such a representation, however, is not unique. Observe that each side of inequalities (14b) contains 2^ρ factors; this allows one to construct a lifted representation for (14b) via 2^ρ - 1 3D rotated quadratic cones using the “tower of variables” technique [3]:


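$$
\begin{aligned}
& w^2 \le v_{\rho-1,1}\, v_{\rho-1,2}, && && \text{(15a)}\\
& v_{\ell,i}^2 \le v_{\ell-1,2i-1}\, v_{\ell-1,2i}, && i = 1, \ldots, 2^{\rho-\ell},\quad \ell = 2, \ldots, \rho-1, && \text{(15b)}\\
& v_{1,i}^2 \le u^2, && i = 1, \ldots, \lfloor s/2 \rfloor, && \text{(15c)}\\
& v_{1,i}^2 \le u\,t, && i = \lfloor s/2 \rfloor + 1, \ldots, \lceil s/2 \rceil, && \text{(15d)}\\
& v_{1,i}^2 \le t^2, && i = \lceil s/2 \rceil + 1, \ldots, \lfloor r/2 \rfloor, && \text{(15e)}\\
& v_{1,i}^2 \le t\,w, && i = \lfloor r/2 \rfloor + 1, \ldots, \lceil r/2 \rceil, && \text{(15f)}\\
& v_{1,i}^2 \le w^2, && i = \lceil r/2 \rceil + 1, \ldots, \lfloor R/2 \rfloor, && \text{(15g)}\\
& w,\ v_{\ell,i},\ u,\ t \ge 0,
\end{aligned}
$$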

where subscripts j are suppressed for brevity. The set of inequalities (15) can be visualized as a binary tree whose nodes represent the variables in (15). Each inequality in (15) can then be viewed as a subgraph with two arcs that connect the “parent” node (the variable at the left-hand side of the inequality) to the two “child” nodes (the variables at the right-hand side of the same inequality). Given this binary structure, the set of second order cones in (15) can be regarded as partitioned into ρ levels indexed by ℓ, where the variable w in (15a) constitutes the root node of the tree and belongs to level ρ, while variables u, t, w in (15c)-(15g) represent the leaf nodes, or 0-level nodes of the tree.

In [10] it has been shown that among the 2^ρ - 1 inequalities (15) there are only O(ρ) = O(log2 r) non-degenerate second order cones, while the rest reduce to linear inequalities that can be omitted. The following bounds on the number of non-degenerate quadratic cones in (15) follow directly from the arguments in [10]:

Proposition 4 ([10]) When p is a positive rational number, p = r/s, such that r > s and the greatest common divisor of r and s is 1, a p-order cone in the positive orthant of R^{n+1} can equivalently be represented by C_p three-dimensional quadratic cones, where C_p satisfies

$$n\rho \le C_p \le n(2^\rho - 1), \qquad \rho = \lceil \log_2 r \rceil. \tag{16}$$

It is easy to see that the order in which the variables u, t, and w are assigned to the leaf nodes in the binary tree (15) can significantly affect the number of non-degenerate quadratic cones needed to represent a rational-order p-cone in R^{n+1}. As an illustration, consider the case p = 3; direct application of (15) yields ρ = 2, R = 4, and a representation of the p = 3 cone (13) that involves 3n 3D rotated quadratic cones:

$$t \ge u_1 + \ldots + u_n; \quad w_j^2 \le v_{1j} v_{2j}, \quad v_{1j}^2 \le u_j t, \quad v_{2j}^2 \le t\, w_j, \quad j = 1, \ldots, n. \tag{17}$$

On the other hand, it is easy to verify that reordering the leaf node inequalities (15c)-(15g) allows for reducing the number of 3D quadratic cones necessary to represent a p = 3 cone in R^{n+1} to 2n:

$$t \ge u_1 + \ldots + u_n; \quad w_j^2 \le t\, v_j, \quad v_j^2 \le u_j w_j, \quad j = 1, \ldots, n. \tag{18}$$

Observe that the numbers of second order cones in representations (17) and (18) correspond to the upper and lower bounds in (16), respectively.
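The effect of leaf ordering in the p = 3 example can be reproduced with a small counting routine. This is only an illustrative sketch: the function `count_cones` and the single-letter leaf labels are assumptions of the example, and a pair of identical children is treated as a degenerate (linear) constraint whose parent is identified with the child.

```python
from itertools import count

def count_cones(leaves):
    """Count non-degenerate cones z^2 <= x*y in a binary tower over `leaves`.

    `leaves` lists the level-0 variables from left to right; its length must
    be a power of two. A pair of identical variables gives z^2 <= x^2, i.e.
    the linear constraint z <= x, so the parent is replaced by x and no cone
    is counted.
    """
    n = len(leaves)
    assert n >= 2 and n & (n - 1) == 0, "number of leaves must be a power of two"
    fresh = count(1)
    level, cones = list(leaves), 0
    while len(level) > 1:
        parents = []
        for x, y in zip(level[0::2], level[1::2]):
            if x == y:
                parents.append(x)                   # degenerate pair
            else:
                parents.append(f"v{next(fresh)}")   # new lifted variable
                cones += 1
        level = parents
    return cones

# p = 3 (r = 3, s = 1, R = 4): leaves are one u, two t, one w, per cone index j.
print(count_cones(list("uttw")))   # ordering used in (17)
print(count_cones(list("ttuw")))   # reordered leaves, as in (18)
```

For a single index j, the first ordering needs three cones and the second needs two, which matches the per-index counts behind the 3n and 2n totals.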

Since  a  reduction  in  the  number  of  second  order  cone  inequalities  in  (15)  leads  to  a  reduction  in  the 
number  of  quadratic  cones  representing  a  rational-order  p-cone  (13)  by  the  order  of  dimensionality  n  of  the 
p-cone,  it  is  of  interest  to  devise  an  “economical”  second  order  cone  representation  of  rational-order  cones. 


3.2  An  “economical”  representation  of  rational-order  p-cone  via  second  order  cones 


Below we demonstrate that the lower bound on C_p in (16) is achievable for any rational p > 1. To this end, consider the following convex pointed cone in R^4_+:

$$\mathcal{P} = \big\{ y \in \mathbb{R}^4_+ \;\big|\; y_0^{k_0} - y_1^{k_1} y_2^{k_2} y_3^{k_3} \le 0 \big\} \tag{19}$$

that satisfies the next four properties:

(P1) k0, k1, k2, k3 ∈ Z+;

(P2) k0 = k1 + k2 + k3;

(P3) k1 + k2 + k3 = 2^q for some integer q ≥ 1;

(P4) exactly two numbers among k1, k2, and k3 are odd.

Proposition 5 Cone P (19) that satisfies (P1)-(P4) can be represented as an intersection of at most q three-dimensional cones of the form {x ∈ R^3_+ | x_3^2 ≤ x_1 x_2}.

Proof: The process of building such a representation of P is based on successive lifting of P into spaces of dimension greater than the previous one by 1, in such a way that the degree of the polynomial in (19) is reduced by half each time. First, assume that k1, k2, k3 > 0 are all different, and q ≥ 2. Without loss of generality, let k1, k2 be odd and such that k2 ≥ k1, and consider the following set in R^5_+:

$$\mathcal{P}^* = \big\{ y \in \mathbb{R}^5_+ \;\big|\; y_0^{\nu_0} - y_4^{\nu_4} y_2^{\nu_2} y_3^{\nu_3} \le 0, \;\; y_4^2 \le y_1 y_2 \big\}, \tag{20}$$

where ν0 = k0/2, ν2 = (k2 - k1)/2, ν4 = k1, ν3 = k3/2.

It is easy to see that any (y0, ..., y3) ∈ P can be extended to (y0, ..., y4) ∈ P*, and any (y0, ..., y4) ∈ P* is such that (y0, ..., y3) ∈ P. As k1 and k2 are odd and positive integers by assumption, due to (P4) k3 is even, whence ν3 is a positive integer. The above assumption also implies that k2 - k1 is even, meaning that ν2 is a positive integer. Similarly, ν0 is an integer and ν0 = 2^(q-1). Also, observe that ν4 + ν2 + ν3 = (k1 + k2 + k3)/2 = k0/2 = ν0. So, the first cone in (20) satisfies properties (P1)-(P3). Next, observe that ν4 = k1 is odd, thus out of the two integers ν2, ν3 exactly one should be odd for ν2 + ν3 + ν4 = 2^(q-1) to hold. Thus, condition (P4) holds as well.

Note that if in our assumption k1 = k2, then ν2 = 0 in (20), but all conditions still hold. Consider the case when q ≥ 2 and one of k1, k2, k3 is zero; assume it is k3. Then k1, k2 should be odd by (P4). Performing the same transformation, we obtain

$$\mathcal{P}^{**} = \big\{ y \in \mathbb{R}^5_+ \;\big|\; y_0^{\nu_0} - y_4^{\nu_4} y_2^{\nu_2} \le 0, \;\; y_4^2 \le y_1 y_2 \big\}, \qquad \nu_0 = k_0/2, \;\; \nu_2 = (k_2 - k_1)/2, \;\; \nu_4 = k_1. \tag{21}$$

The first cone of P** still has properties (P1)-(P4), and (y0, ..., y3) ∈ P can be extended to (y0, ..., y4) ∈ P**, and any (y0, ..., y4) ∈ P** is such that (y0, ..., y3) ∈ P.

If q = 1, then one of k1, k2, k3 is zero, and the two others are necessarily equal to 1. In this case P is already a quadratic cone. Thus, the above lifting transformation can be carried out no more than q - 1 times, and the conic set P (19) can be represented by at most q quadratic cones using at most q - 1 new variables. ■

With the help of Proposition 5 we can now establish the following result on second order cone representation of rational-order p-cones:

Theorem 3 Let p > 1 be a positive rational number, p = r/s, where the greatest common divisor of r and s is 1. Then a p-order cone in the positive orthant of R^{n+1} can equivalently be represented by n⌈log2 r⌉ three-dimensional rotated quadratic cones.

Proof: In accordance with (13)-(14b), the problem of representing an (r/s)-cone in R^{n+1}_+ via second order cones can be reduced to finding a second order cone representation of n sets of the form

$$\mathcal{Q} = \big\{ y \in \mathbb{R}^3_+ \;\big|\; y_3^{R} - y_1^{s}\, y_2^{\,r-s}\, y_3^{R-r} \le 0 \big\}, \tag{22}$$

where R = 2^ρ, ρ = ⌈log2 r⌉. Observe that cone Q is equivalent to the intersection of cone P (19), where k1 = s, k2 = r - s, k3 = R - r, with the hyperplane y0 = y3. Indeed, properties (P1)-(P3) are obvious, and (P4) holds since if r and s do not have a common divisor greater than 1, neither do r - s and s, whereby r - s and s cannot both be even; as k1 + k2 + k3 = R is even, exactly two of the three numbers are then odd.

Note that an iteration of the lifting procedure described in Proposition 5 corresponds to a specific order in which the variables at some level of the binary tree are arranged. For example, the first iteration of lifting corresponds to arranging the 0-level variables {w, t, u} = {y1, y2, y3} in pairs corresponding to second order cone constraints, such that y1 and y2 make k1 pairs, or non-degenerate cones y4^2 ≤ y1 y2; the remaining k2 - k1 variables y2 form (k2 - k1)/2 pairs, or degenerate cones y4'^2 ≤ y2^2, and variables y3 form k3/2 pairs, or degenerate cones y4''^2 ≤ y3^2, assuming that k1 ≤ k2 are odd. Obviously, the degenerate cones can simply be disregarded.

Hence, by Proposition 5, Q admits representation by at most ρ = ⌈log2 r⌉ second order cones; combining this with Proposition 4, one obtains that each of the n sets of the form Q admits representation using exactly ρ = ⌈log2 r⌉ second order cones. ■
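The lifting procedure of Proposition 5, applied to the inequality w^R ≤ u^s t^(r-s) w^(R-r), can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function `economical_cones`, the variable labels "u", "t", "w", "v1", "v2", ... and the tuple encoding (z, x, y) for z^2 ≤ xy are assumptions of the example.

```python
from math import gcd

def economical_cones(r, s):
    """Cones (z, x, y), each meaning z^2 <= x*y, for w^R <= u^s t^(r-s) w^(R-r).

    Assumes p = r/s > 1 in lowest terms, with R = 2^rho and rho = ceil(log2 r).
    """
    assert r > s >= 1 and gcd(r, s) == 1
    rho = (r - 1).bit_length()            # equals ceil(log2(r)) for r >= 2
    total = 1 << rho                      # R, the degree of each side
    exps = {"u": s, "t": r - s, "w": total - r}
    exps = {name: k for name, k in exps.items() if k > 0}
    cones = []
    while total > 2:
        # Exactly two exponents are odd (property (P4)); a has the smaller one.
        a, b = sorted((n for n in exps if exps[n] % 2), key=exps.get)
        v = f"v{len(cones) + 1}"          # new lifted variable with v^2 <= a*b
        cones.append((v, a, b))
        ka, kb = exps.pop(a), exps.pop(b)
        exps = {name: k // 2 for name, k in exps.items()}   # the even exponent
        exps[b] = (kb - ka) // 2
        exps[v] = ka
        exps = {name: k for name, k in exps.items() if k > 0}
        total //= 2
    x, y = sorted(exps)                   # two variables with exponent 1 remain
    cones.append(("w", x, y))             # root cone: w^2 <= x*y
    return cones

print(economical_cones(3, 1))   # [('v1', 'u', 'w'), ('w', 't', 'v1')]
print(len(economical_cones(7, 3)))
```

For p = 3 the sketch returns the two cones v^2 ≤ uw and w^2 ≤ tv, i.e. the per-index part of representation (18); in general the number of returned cones equals ⌈log2 r⌉.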

It is well known that second order cone sets admit an equivalent semidefinite representation in the form of linear matrix inequalities (LMIs). In general, p-order cones are not LMI-representable in the space of original variables (see an example for the p = 4 cone in [7, 8]), but admit lifted LMI representations.

Corollary 1 Conic set Q (22) admits a lifted representation in the form of an LMI

$$\sum_{i=1}^{\rho+2} A_i y_i \succeq 0,$$

where A_i ∈ R^{2ρ×2ρ} are symmetric matrices, in the sense that the projection of the solution set Q* of this LMI onto the space of variables (y1, y2, y3) coincides with Q.

4  Computational  study 

In this section we report computational results on using the p-norm discrimination model (5)-(6) for linear separation of sets. In particular, we employ the “economical” SOCP reformulation approach presented above to solve the p-order cone programming (pOCP) problem (6) in the case when p is rational, and compare it with the polyhedral approximation technique of [10].

In our computational experiments we used three data sets from the UCI Machine Learning Repository. The first is the Wisconsin Breast Cancer data set, with a total of 683 instances and 9 attributes. It contains 444 instances with benign diagnosis and 239 instances with malignant diagnosis. The second data set, the Cleveland Heart Disease data set, contains 281 instances with 13 attributes; of them, 125 instances correspond to positive diagnosis and 156 instances correspond to negative diagnosis. Finally, the Pima Indians Diabetes data set reports 768 instances with 8 attributes, including 266 instances of positive diagnosis and 502 instances of negative diagnosis. Both the Wisconsin Breast Cancer and Cleveland Heart Disease data sets (in their then-up-to-date versions) were used in [4].

For each data set, training and testing were performed by randomly selecting 100 training sets with an equal number of points of both types, and testing the obtained separator on the data not included in the training set. For computational purposes, the data in the training data sets was normalized and scaled by a factor of 10^4; the same transformation was then applied to the testing data. After the training and testing procedures were performed, the average misclassification error on the testing set was computed. It is important to comment on the selection of parameter p in (6): as a general rule that follows from our numerical experiments and is consistent with the motivation presented in Section 2, smaller values of p (around p = 2) are beneficial for well-separable data sets with smaller misclassification errors, whereas larger values of p > 3 allow for reducing large misclassification errors in linear separation. With this in mind, a particular value of p can be selected during the training procedure.

Table 1 reports the average out-of-sample misclassification error for each data set, together with the respective “best” value of p at which this error was obtained. It also includes results for the cases of p = 1, which corresponds to minimization of the average of misclassifications due to [4], p = ∞, corresponding to minimization of the largest misclassification errors, and SVM model (11). Figures 1, 2, and 3 illustrate the behavior of the misclassification error in the described data sets with respect to the value of parameter p in (5)-(6), which was varied in the range of 1.0 to 4.0 with a 0.1 step. As it follows from Table 1 and Figures 1-3, the p-norm separation model (5)-(6) with p > 1 allows for an improved classification accuracy as compared to the cases of p = 1 proposed in [4], the SVM model (11), and the worst-error approach of p = ∞.

In addition to classification capabilities of the p-norm linear separation model (5)-(6), its computational properties were investigated. In particular, for all the data sets described above we compared the running times of the cutting plane procedure for polyhedral approximations of problem (6) due to [10], denoted as LP/CP, and the “economical” SOCP reformulation of (6), along with the corresponding results for SVM model (11) and the p = ∞ case. All models were coded in C++, and the CPLEX 12.2 solver was used to solve the resulting LP, SOCP, and QP problems. A dual-core 3GHz CPU computer with 2GB of RAM was used to run the computations. Figure 4 illustrates the corresponding running times on the example of the Wisconsin Breast Cancer data set, along with the values of the parameter ρ = ⌈log2 r⌉, where p = r/s, which is proportional to the number of second order cones in the SOCP reformulation of the rational-order p-cone programming problem (6). From Figure 4 it follows that the solution times for the SOCP reformulation of a rational-order p-cone programming model (6) are highly correlated with the number of second order cones in the reformulated problem. On the other hand, solution times of a polyhedral approximation of (6) solved with a cutting plane method (LP/CP) exhibit relatively little dependence on the value of the parameter p, and are competitive with the running times of the SVM model. Computational performance of the considered models on the other data sets is very similar to that presented in Figure 4.

Table 1: Classification results for different data sets: the lowest average misclassification error, the corresponding value of p, and misclassification error for the cases of p = 1, p = ∞, and SVM model (11).

Dataset                            Error     Best p    p = 1     SVM       p = ∞
Wisconsin Breast Cancer Dataset    3.95%     1.8       4.11%     4.03%     4.21%
Cleveland Heart Disease Dataset    18.7%     3.8       19.5%     18.98%    19.11%
Pima Indians Diabetes Dataset      31.82%    3.4       35.29%    34.02%    33.51%

Figure 1: Misclassification error as a function of p for Wisconsin Breast Cancer data set.

Figure 2: Misclassification error as a function of p for Cleveland Heart Disease data set.

Figure 3: Misclassification error as a function of p for Pima Indians Diabetes data set.

Figure 4: Average running time for instances of Wisconsin Breast Cancer data set.


Abstract

In this paper, a branch-and-bound algorithm for finding all cliques of size k in a k-partite graph is proposed that improves upon the method of Grunert et al. (2002). The new algorithm uses bit-vectors, or bitsets, as the main data structure in bit-parallel operations. Bitsets enable a new form of data representation that improves branching and backtracking of the branch-and-bound procedure. Numerical studies on randomly generated instances of k-partite graphs demonstrate competitiveness of the developed method.

Keywords: maximum clique enumeration problem, k-partite graph, k-clique, bit parallelism


1  Introduction 

Given an (undirected) graph G = (V, E), where V is the set of nodes and E is the set of arcs, a clique in G is defined as a complete subset of G, i.e., a set of nodes in V that are pairwise adjacent. A clique of size k is called a k-clique;¹ the largest clique in a graph is called the maximum clique and its size is denoted by ω(G). Note that G may contain several cliques of size ω(G). Closely related to the concept of a clique is that of an independent set of G, defined as a subset of V whose nodes are pairwise non-adjacent.

* Corresponding author.

¹ It is worth noting that the term k-clique is used in several different contexts in the literature; for instance, one of its alternative interpretations is that of a subgraph where any two nodes are connected by a path of length at most k [10]. In this work, we use the definition of k-clique as given above.

The Maximum Clique Problem (MCP) consists in finding the largest clique in a graph, and is of fundamental importance in discrete mathematics, computer science, operations research, and related fields [1]. In many applications it is of interest to identify all maximum cliques in a graph. This problem is known as the Maximum Clique Enumeration Problem (MCEP). In the present work, we consider a special case of the MCEP, concerned with finding all k-cliques in a k-partite graph. A graph G = (V, E) is called k-partite if the set of nodes V can be partitioned into k independent sets, or partites V_r, r = 1, ..., k:

$$V = \bigcup_{r=1}^{k} V_r, \qquad V_r \cap V_s = \emptyset, \;\; r \ne s, \qquad \text{such that for all } i, j \in V_r: \; (i, j) \notin E. \tag{1}$$

Clearly, one has that ω(G) ≤ k in a k-partite graph G, since the maximum clique cannot contain more than one node from each independent set V_r. Note also that the problem of finding all k-cliques in a k-partite graph is not equivalent to MCEP since it does not account for maximum cliques with ω(G) < k.
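Condition (1) is straightforward to verify for a given candidate partition. The snippet below is a small sketch under assumed data structures (partites as Python sets, edges as pairs of nodes); the function name and the sample graph are hypothetical.

```python
def is_k_partite_partition(parts, edges):
    """Check condition (1): partites are pairwise disjoint and contain no internal edge."""
    owner = {}
    for r, part in enumerate(parts):
        for v in part:
            if v in owner:                 # node appears in two partites
                return False
            owner[v] = r
    # every edge must join two known nodes from different partites
    return all(u in owner and v in owner and owner[u] != owner[v]
               for u, v in edges)

parts = [{0, 1}, {2, 3}, {4}]              # hypothetical 3-partite example
edges = [(0, 2), (1, 3), (2, 4), (0, 4)]
print(is_k_partite_partition(parts, edges))              # True
print(is_k_partite_partition(parts, edges + [(0, 1)]))   # False: edge inside a partite
```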

The problem of finding k-cliques in k-partite graphs has applications in many areas of science and engineering, including the textile industry [3], where the braiding problem can be reduced to the problem of finding k-cliques in the path compatibility graph that represents a k-partite graph; data mining, particularly for clustering of categorical attributes over k-domains [12]; and identification of protein structures [9], where a protein interaction network is represented by a k-partite graph that is mined for k-cliques. Recently, it has been shown that the problem of finding k-cliques in k-partite graphs can be used to find high-quality solutions of large-scale randomized instances of the multidimensional assignment problem (MAP) [6, 7, 11].

Grunert et al. [3] proposed a branch-and-bound algorithm FINDCLIQUE for the problem of finding all k-cliques in k-partite graphs, which takes as an input a graph G = (V, E), where V satisfies (1), and produces the set Q of k-cliques contained in G as an output. FINDCLIQUE is a recursive method, such that level t of recursion corresponds to level t of the branch-and-bound tree, which, in turn, is associated with the t-th partite that is branched on in V. Starting at the root (t = 0) of the branch-and-bound tree with a partial solution S = ∅, at each step of the branch-and-bound procedure a node is added to or removed from S until S amounts to a k-clique in G, i.e., |S| = k, or it is verified that G contains no k-cliques, ω(G) < k.

Let B = {1, ..., k} be the index set of partites in G, V = ∪_{b∈B} V_b, and denote by B_S the set of partites that have a node in S:

$$B_S = \{\, b \in B \mid V_b \cap S \ne \emptyset \,\}.$$

Given a partial solution S, a node is called compatible if it is adjacent to all the nodes in S; the set of compatible nodes w.r.t. S is denoted by C_S:

$$C_S = \{\, i \in V \mid (i, j) \in E \;\; \forall j \in S \,\}.$$

The set C_S is further partitioned into subsets containing nodes from the same partite:

$$C_S = \bigcup_{b \in \bar{B}_S} C_{S,b},$$

where \bar{B}_S = B \ B_S, and C_{S,b} ⊆ V_b is the set of nodes of V_b adjacent to every node of S,

$$C_{S,b} = \bigcap_{s \in S} \big( V_b \cap N(s) \big),$$

with N(s) being the set of nodes adjacent to node s.
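The definitions of B_S and C_{S,b} can be made concrete with a short sketch. The data layout (partites as a list of sets, adjacency as a dictionary of neighbour sets) and the sample graph are assumptions of this example, not part of the original algorithm description.

```python
def compatible_sets(parts, adj, S):
    """Return {b: C_{S,b}} for every partite index b that has no node in S."""
    free = [b for b, part in enumerate(parts) if not (part & S)]
    # a node is compatible if it is adjacent to all nodes of S
    return {b: {i for i in parts[b] if all(i in adj[s] for s in S)}
            for b in free}

# Hypothetical 3-partite graph with partites {0,1}, {2,3}, {4,5}.
parts = [{0, 1}, {2, 3}, {4, 5}]
edge_list = [(0, 2), (0, 3), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5)]
adj = {v: set() for part in parts for v in part}
for a, b in edge_list:
    adj[a].add(b)
    adj[b].add(a)

print(compatible_sets(parts, adj, set()))    # root: C_{S,b} = V_b for all b
print(compatible_sets(parts, adj, {0}))      # {1: {2, 3}, 2: {4}}
```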

At the root node of the branch-and-bound tree (t = 0), one has S = ∅, B = \bar{B}_S = {1, ..., k}, B_S = ∅, and C_{S,b} = V_b for all b ∈ B. At a level t of the branch-and-bound tree, b_t ∈ \bar{B}_S is selected as the partition to branch on. In order to achieve the greatest reduction in the size of the branch-and-bound tree when pruning, b_t is selected as the partition with the smallest number of nodes:

$$b_t \in \arg\min_{b} \big\{\, |C_{S,b}| \;\big|\; b \in \bar{B}_S \,\big\}. \tag{2}$$

As long as there is a node n_t ∈ C_{S,b_t} that is not traversed, the search process is restarted from this point with S := S ∪ {n_t} as the new partial solution. To this end, the set C_S of compatible nodes is updated with respect to S ∪ {n_t}:

$$C_{S,b} := C_{S,b} \cap N(n_t) \quad \text{for all } b \in \bar{B}_S. \tag{3}$$

Maintaining the sets C_{S,b} of nodes compatible with the current partial solution S is a key aspect of the algorithm, thus for backtracking purposes the nodes that are removed from C_{S,b} during (3) are added to the set C = ∪_{t=1}^{k} C_t, which is similarly partitioned into k levels C_t, each level corresponding to level t of the branch-and-bound tree. In other words, C_t contains the nodes in C_{S,b} that are not adjacent to node n_t:

$$C_t = \{\, i \in C_{S,b} \mid (i, n_t) \notin E, \; b \in \bar{B}_S \,\}.$$

Obviously, after this step, C_{S,b_t} = ∅. A subproblem with a partial solution S is promising if all of the partitions in C_S that do not share a node in the partial solution are nonempty:

$$|C_{S,b}| > 0 \quad \text{for all } b \in \bar{B}_S, \; b \ne b_t. \tag{4}$$

Let P be the number of partitions C_{S,b} ⊆ C_S that contain at least one node; then, an upper bound on the size of the largest clique containing S is given by |S| + P. If |S| + P = k, the current subproblem is feasible, meaning S may be part of a k-clique. For a feasible subproblem, the algorithm traverses deeper into the branch-and-bound tree, t := t + 1, and a new subproblem is created.




Accordingly, a subproblem with partial solution S is pruned if

$$|S| + P < k, \tag{5}$$

i.e., there exists no clique of size k that contains S. For a nonpromising subproblem, set C_{S,b_t} is restored by moving the nodes in C_t back to C_S, C_S := C_S ∪ C_t. The last operation implicitly requires that the nodes from C_t are put back into the partitions of C_S that they were removed from:

$$C_{S,\pi(v)} := C_{S,\pi(v)} \cup \{v\} \quad \text{for all } v \in C_t, \tag{6}$$

where π(i) is the index of the partite that node i belongs to: i ∈ V_{π(i)}; moreover, the relative orders of nodes in the partites V_b should be preserved in C_{S,b}, given that the nodes in G are assumed to be ordered/numbered.

The search process is then restarted, provided that there exists a node in partition C_{S,b_t} that is not traversed. If there is no such node, FINDCLIQUE returns to the previous level t - 1 of the branch-and-bound tree.
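The branching rule (2), the update (3), and the pruning test (5) can be put together in a compact recursive sketch. This is a simplified illustration rather than the FINDCLIQUE procedure itself: instead of storing removed nodes in the levels C_t and restoring them on backtracking, it passes freshly built candidate sets down the recursion. The function name, data layout, and sample graph are assumptions of the example.

```python
def find_k_cliques(parts, adj):
    """Enumerate all k-cliques of a k-partite graph.

    parts: list of k disjoint node sets; adj: dict mapping node -> set of neighbours.
    """
    k = len(parts)
    cliques = []

    def branch(S, cand):
        # cand maps each partite index b without a node in S to C_{S,b}
        if len(S) == k:
            cliques.append(tuple(S))
            return
        # pruning rule (5): some remaining partite has no compatible node
        if any(not c for c in cand.values()):
            return
        # branching rule (2): partite with the fewest compatible nodes
        bt = min(cand, key=lambda b: len(cand[b]))
        for nt in sorted(cand[bt]):
            # update rule (3): keep only neighbours of nt in the other partites
            new_cand = {b: c & adj[nt] for b, c in cand.items() if b != bt}
            branch(S + [nt], new_cand)

    branch([], {b: set(p) for b, p in enumerate(parts)})
    return cliques

# Hypothetical 3-partite graph containing the 3-cliques {0, 2, 4} and {1, 3, 5}.
parts = [{0, 1}, {2, 3}, {4, 5}]
adj = {0: {2, 3, 4}, 1: {3, 5}, 2: {0, 4}, 3: {0, 1, 5}, 4: {0, 2}, 5: {1, 3}}
print(find_k_cliques(parts, adj))
```

Copying the candidate sets trades memory for simplicity; the explicit bookkeeping with C_t described above avoids these copies.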

2 A bitwise algorithm for finding k-cliques in a k-partite graph

In this section, we present an algorithm, referred to as BitCLQ, for the k-clique enumeration problem in a k-partite graph, which improves upon the FINDCLIQUE algorithm of Grunert et al. [3] by introducing bitset data structures and utilizing bit parallelism for updating the set of compatible nodes and improving backtracking.

2.1  Bitsets 

Bitsets  are  essentially  binary  vectors,  or  sequences  of  bits,  and  as  such  can  be  utilized 
efficiently  in  computer  codes.  Particularly,  bitsets  are  useful  for  storing  adjacency  matrices 
of  graphs,  or  specific  subsets  of  ordered  sets.  For  example,  in  a  graph  on  six  nodes 
$V = \{v_1, \ldots, v_6\}$, a clique with nodes $v_1, v_2, v_3, v_5$ can be represented by a bitset {111010},
where  each  bit  corresponds  uniquely  to  a  node  in  the  graph,  with  the  significant  bits 
(i.e.,  bits  equal  to  1)  indicating  the  nodes  in  the  clique.  Bit  parallelism  is  a  form  of 
parallel  computing  that  achieves  computational  improvements  by  representing  the  problem 
data  in  bitsets  of  size  R,  where  R  is  the  machine  word  size  (e.g.,  32  or  64),  such  that 
they  can  be  processed  together  within  a  single  processor  instruction.  Bit  parallelism  has 
been  successfully  used  in  many  computational  algorithms,  particularly  for  string  matching 
[2,  4,  5].  Recently,  bit  parallelism  has  been  employed  for  solving  hard  combinatorial 
problems,  such  as  SAT  [14]  and  the  Maximum  Clique  Problem  [13]. 
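The six-node example above can be reproduced with a short Python sketch. Python integers
stand in for machine words here, and the helper names `to_bitset` and `to_string` are
hypothetical; the convention that node 1 is the leftmost bit follows the example in the text.

```python
def to_bitset(nodes, n):
    """Pack 1-based node indices into an integer; node 1 is the leftmost of n bits."""
    bits = 0
    for v in nodes:
        bits |= 1 << (n - v)
    return bits


def to_string(bits, n):
    """Render an n-bit bitset as a string of 0s and 1s, node 1 first."""
    return format(bits, "0{}b".format(n))


clique = to_bitset([1, 2, 3, 5], 6)
print(to_string(clique, 6))            # 111010

# A single AND intersects two node subsets in one operation.
other = to_bitset([2, 5, 6], 6)
print(to_string(clique & other, 6))    # 010010
```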

In the present work, bit parallelism is used to improve the computational procedure for
updating the set of compatible nodes in (3), and, moreover, to achieve faster backtracking
by eliminating the need for set C. In addition, use of bitsets allows for improvements in
memory storage efficiency for problem data structures, such as the set of compatible nodes
and the adjacency matrix of the graph.

Of particular significance in the context of the present work is the operation of indexing
the first significant bit in a bitset, also known as forward bit scanning. One of the
techniques for this purpose relies on the use of a De Bruijn sequence with a perfect hash
table [8]. The value to be looked up in the hash table is given by $H_R$ below:

$H_R := \big( (x \wedge -x) \cdot D \big) \gg (R - \log_2 R)$,  (7)

where $x$ is the bitset for which the first significant bit has to be indexed, $D$ is an instance
of a De Bruijn sequence, $R$ is the machine word size, and $\gg$ stands for the binary shift
right operator. The bitwise AND $x \wedge -x$ isolates the bit being indexed. $H_R$ is effective for
bitsets of maximum size equal to $R$. For larger bitsets, special containers need to be devised.
The hash table required to look up the value of $H_R$ is created based on the particular
De Bruijn sequence used in (7).

Note that in (7) multiplication is performed modulo $2^R$, and only the $\log_2 R$ highest-order
bits of the product are retained after the shift. More details on forward bit scanning and the
specification of the De Bruijn sequence used in (7) can be found in [8].
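A minimal Python sketch of the lookup in (7) is given below for $R = 32$. The constant
`0x077CB531` is a widely used 32-bit De Bruijn sequence and is an assumption here; it is not
necessarily the sequence specified in [8]. The hash table is generated from the constant
rather than typed in, and bit index 0 denotes the least significant bit.

```python
R = 32                       # assumed machine word size
LOG2_R = 5                   # log2(R)
MASK = (1 << R) - 1          # emulates R-bit unsigned wrap-around (arithmetic modulo 2^R)
DEBRUIJN = 0x077CB531        # assumed De Bruijn sequence B(2, 5)

# Perfect hash table: the top LOG2_R bits of (2^i * D mod 2^R) differ for every i.
TABLE = [0] * R
for i in range(R):
    TABLE[((DEBRUIJN << i) & MASK) >> (R - LOG2_R)] = i


def bit_scan_forward(x):
    """Index of the lowest set bit of a nonzero R-bit word (0 = least significant)."""
    x &= MASK
    if x == 0:
        raise ValueError("bitset has no significant bit")
    lowest = x & -x                                   # isolate the lowest set bit
    h = ((lowest * DEBRUIJN) & MASK) >> (R - LOG2_R)  # hash value H_R from (7)
    return TABLE[h]


print(bit_scan_forward(0b101000))    # 3
```

In compiled code the same lookup costs one multiplication, one shift, and one table access
per scan; bitsets longer than one word would be scanned word by word.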


2.2  BitCLQ 


Below we present a modification of FINDCLIQUE, which we refer to as BitCLQ, that uses
bitset data structures and bit parallelism for keeping track of the nodes in $G$ that are
compatible with the current partial solution $S$, while simultaneously reducing the computational
cost of backtracking.

To this end, we introduce a set $Z$ consisting of $k$ levels, $Z_1, \ldots, Z_k$. Each of these
$k$ levels will be used to represent the nodes compatible with the partial solution $S$ at the
$t$-th level of the branch-and-bound tree, where $1 \le t \le k$. Every level in $Z$ is further
partitioned into $k$ sets, each corresponding to a partite $V_b$ in $G$:

$Z_t = \bigcup_{b \in B} Z_{t,b}$,  $t = 1, \ldots, k$.

The sets $Z_{t,b}$ are represented by bitsets of size $|V_b|$. Let $Z_{t,b,i}$ be the $i$-th bit in $Z_{t,b}$
corresponding to the $i$-th node in $V_b$, such that $Z_{t,b,i} = 1$ if the $i$-th node in $V_b$ is compatible
with all the nodes in the partial solution $S$ at the $t$-th level of the branch-and-bound tree
in BitCLQ:

$Z_{t,b,i} = \begin{cases} 1, & \text{if } (i,j) \in E \text{ for all } j \in S_t; \\ 0, & \text{otherwise.} \end{cases}$

Clearly, each level $Z_t$ of $Z$ is an ordered combination of bitsets with the total size
$|V|$. Further, the adjacency matrix $M$ of graph $G$ is stored in the bitset form, with the
convention that the $i$-th row (column) corresponds to the $i$-th bit in $Z_t$, $t = 1, \ldots, k$.

BitCLQ is initialized by setting $t := 0$, $S := \emptyset$, $B = B_S := \{1, \ldots, k\}$, and $Q := \emptyset$,
where $Q$ is the set of all $k$-cliques in $G$. Note that since at the beginning all the nodes in
$G$ can be added to $S$ to extend its size, all the bits in $Z_1$ are significant:

$Z_{1,b,i} = 1$ for all $b \in B_S$, $i \in V_b$.

At level $t$ of the branch-and-bound tree, the partition $b_t$ to branch on is selected as

$b_t \in \arg\min_b \{ |Z_{t,b}| : b \in B_S \}$,  (8)

where $|Z_{t,b}|$ is defined as the number of significant bits in the bitset $Z_{t,b}$. The forward
bit scanning method discussed in Section 2.1 is used to identify a node $n_t \in V_{b_t}$ that has
not been traversed and thus can be added to the partial solution. As long as such a node
exists in $V_{b_t}$, the search process is restarted with $S := S \cup \{n_t\}$ as the partial solution, and
the corresponding bit in $Z_{t,b_t}$ is set to 0.

Utilizing bitsets also facilitates the process of updating the compatible nodes: when $n_t$
is added to the partial solution, $Z_{t+1}$ is created by performing a logical AND operation with
$Z_t$ and the row $M(n_t)$ of the adjacency matrix corresponding to the node $n_t$ as operands:

$Z_{t+1} = Z_t \wedge M(n_t)$.  (9)

Similarly to FINDCLIQUE, let $P$ denote the number of nonempty partitions of the updated
set of compatible nodes, i.e., the partitions $Z_{t+1,b}$, $b \in B_S$, with $|Z_{t+1,b}| > 0$. If
$|S| + P = k$, the current partial solution is promising, so that a new subproblem is created,
and BitCLQ proceeds one level deeper into the branch-and-bound tree, $t := t + 1$. If the
partial solution is not promising, the method presented in Section 2.1 is used to select a
node in $Z_{t,b_t}$ that has not been traversed. If such a node is found, the search process is
restarted; otherwise, backtracking is performed by simply updating $t := t - 1$. Note that due
to the special structure of $Z$, BitCLQ does not need to restore the set of compatible nodes
during backtracking, in contrast to the update procedure (6) for the set $C_S$ that is
performed in FINDCLIQUE.
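The procedure described in this section can be sketched in Python as follows. This is an
illustrative reading of the description, not the authors' C++ implementation: Python integers
serve as bitsets of arbitrary length, recursion replaces the explicit updates of $t$,
`int.bit_length` is used in place of the De Bruijn lookup, nodes are numbered from 0, and all
names (`bitclq`, `part_mask`, `search`) are hypothetical.

```python
def bitclq(partites, edges):
    """Enumerate all k-cliques of a k-partite graph with per-level bitsets.

    partites: list of k lists of node indices (0-based, disjoint).
    edges: iterable of pairs (u, v) joining nodes of different partites.
    """
    k = len(partites)
    n = sum(len(p) for p in partites)
    adj = [0] * n                                   # bitset rows of the adjacency matrix
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    part_mask = [sum(1 << v for v in p) for p in partites]
    Z = [0] * (k + 1)                               # Z[t]: compatible nodes at level t
    Z[0] = (1 << n) - 1                             # initially every node is compatible
    cliques, S = [], []

    def search(t, remaining):
        # Branch on the remaining partite with the fewest compatible nodes, as in (8).
        bt = min(remaining, key=lambda b: bin(Z[t] & part_mask[b]).count("1"))
        rest = [b for b in remaining if b != bt]
        while Z[t] & part_mask[bt]:
            cand = Z[t] & part_mask[bt]
            low = cand & -cand                      # lowest untraversed bit
            node = low.bit_length() - 1             # forward bit scan
            Z[t] &= ~low                            # mark the node as traversed
            S.append(node)
            if len(S) == k:
                cliques.append(tuple(S))
            else:
                Z[t + 1] = Z[t] & adj[node]         # update (9)
                P = sum(1 for b in rest if Z[t + 1] & part_mask[b])
                if len(S) + P == k:                 # promising: go one level deeper
                    search(t + 1, rest)
            S.pop()                                 # backtrack; Z[t] needs no restoring

    if k > 0:
        search(0, list(range(k)))
    return cliques


# Small hypothetical 3-partite graph with partites {0,1}, {2,3}, {4,5}.
parts = [[0, 1], [2, 3], [4, 5]]
links = [(0, 2), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5), (0, 3), (3, 4)]
print(sorted(tuple(sorted(c)) for c in bitclq(parts, links)))
# [(0, 2, 4), (0, 3, 4), (1, 3, 5)]
```

Because each level keeps its own bitset, returning from a recursive call leaves `Z[t]`
untouched apart from the bits already marked as traversed, which is the backtracking saving
described above.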

2.3  Example 

As an illustration, consider the 3-partite graph that is shown along with its adjacency
matrix $M$ in Figure 1, where partite 1 consists of nodes {1,2,3}, partite 2 contains
nodes {4,5,6}, and partite 3 contains nodes {7,8,9}. BitCLQ is initialized by setting
$S := \emptyset$, $B_S := \{1,2,3\}$ and $Z_1 := \{111|111|111\}$. Since all the partites are of the same
size, i.e., $|Z_{1,b}| = 3$ for all $b \in B_S$, the one to branch on is chosen arbitrarily; assume that
the first partite $Z_{1,1}$ is chosen for branching. The search process from this point restarts
3 times, each time adding one of the three nodes in $Z_{1,1}$. The first node to add to $S$ is
node 1; $Z_{1,1,1}$ is then set to 0, and $Z_2$ is subsequently created by performing a logical AND
operation with $Z_1$ and the corresponding row of the adjacency matrix $M$ as operands:

$t := 1$,
$S := \{1\}$,
$Z_2 := Z_1 \wedge M(1) = \{011|111|111\} \wedge \{000|111|011\} = \{000|111|011\}$,
$B_S := \{2,3\}$.

As a result, the set $Z_2$ of nodes compatible with the partial solution $S = \{1\}$ contains
nodes {4, 5, 6, 8, 9}. Since none of the partites in $B_S$ is empty, the partial solution $S$ is
promising and a new subproblem is created. The objective in the new subproblem is to
find a $|B_S|$-clique in $Z_2$. A node from $Z_{2,3}$ will be added to $S$ (since $|Z_{2,3}| < |Z_{2,2}|$). The
first node in $Z_{2,3}$ to add to the partial solution is node 8. The bit corresponding to node
8 is set $Z_{2,3,2} := 0$, and we have

$t := 2$,
$S := \{1,8\}$,
$Z_3 := Z_2 \wedge M(8) = \{000|111|001\} \wedge \{111|001|000\} = \{000|001|000\}$,
$B_S := \{2\}$.

Again, each partite in $B_S$ contains at least one node (node 6) in $Z_3$, so the partial solution
is promising, and a new subproblem is created. In the next step, node 6 is added to $S$:

$t := 3$,
$S := \{1,8,6\}$.

At this point, $|S| = k = 3$, i.e., a $k$-clique is found. To continue the search for other
$k$-cliques, the last node in $S$ is removed. BitCLQ searches $Z_{3,2}$ for another node that can
be added to $S$. Since such a node does not exist, the algorithm backtracks: $t := 2$, node
8 is removed from $S$, and BitCLQ restarts with $S = \{1, 9\}$ as the partial solution.

Figure 1: A 3-partite graph and its adjacency matrix.

Algorithm 1 BitCLQ($t$)

    $b_t \in \arg\min_b \{ |Z_{t,b}| : b \in B_S \}$
    $i$ := the first significant bit in $Z_{t,b_t}$
    repeat
        $n_t$ := the $i$-th node in $V_{b_t}$
        $Z_{t,b_t,i} := 0$
        $S := S \cup \{n_t\}$
        if $|S| = k$ then
            $Q := Q \cup \{S\}$
            $S := S \setminus \{n_t\}$
        else
            $Z_{t+1,b} := Z_{t,b} \wedge M(n_t)$ for all $b \in B_S$
            $\bar{B}_S := \bar{B}_S \cup \{b_t\}$;  $B_S := B_S \setminus \{b_t\}$
            $P$ := number of partitions $Z_{t+1,b}$ with $|Z_{t+1,b}| > 0$, $b \in B_S$
            if $|S| + P = k$ then
                BitCLQ($t + 1$)
                $S := S \setminus \{n_t\}$
                $\bar{B}_S := \bar{B}_S \setminus \{b_t\}$;  $B_S := B_S \cup \{b_t\}$
            else
                $S := S \setminus \{n_t\}$
                $\bar{B}_S := \bar{B}_S \setminus \{b_t\}$;  $B_S := B_S \cup \{b_t\}$
            end if
        end if
        $i$ := the first significant bit in $Z_{t,b_t}$
    until $i > |V_{b_t}|$
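The two AND updates in the example of Section 2.3 can be checked with a few lines of Python.
Only the rows $M(1)$ and $M(8)$ quoted in the example are used; the helper names `parse`,
`show`, and `nodes` are hypothetical, and node 1 is taken as the leftmost bit, as in the text.

```python
def parse(s):
    """Parse a partitioned bitset such as '011|111|111' into an integer."""
    return int(s.replace("|", ""), 2)


def show(z, sizes=(3, 3, 3)):
    """Render an integer bitset with '|' separating the partites."""
    bits = format(z, "0{}b".format(sum(sizes)))
    out, pos = [], 0
    for m in sizes:
        out.append(bits[pos:pos + m])
        pos += m
    return "|".join(out)


def nodes(z, n=9):
    """List the 1-based node numbers whose bits are set."""
    return [v for v in range(1, n + 1) if (z >> (n - v)) & 1]


Z1 = parse("011|111|111")             # Z_1 after the bit of node 1 is cleared
M1 = parse("000|111|011")             # row M(1) from the example
Z2 = Z1 & M1
print(show(Z2), nodes(Z2))            # 000|111|011 [4, 5, 6, 8, 9]

Z2 &= ~parse("000|000|010")           # clear the bit of node 8 (Z_{2,3,2} := 0)
M8 = parse("111|001|000")             # row M(8) from the example
Z3 = Z2 & M8
print(show(Z3), nodes(Z3))            # 000|001|000 [6]
```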
Table 1: Average computational time (in seconds) to find all the k-cliques (#CLQ) contained
in randomly generated k-partite graphs.

   k     m    |V|     p     #CLQ   FINDCLIQUE   BitCLQ
   3   100    300   0.1     1004        0.005    0.002
   4   100    400   0.15    1124        0.008    0.002
   5   100    500   0.2     1047        0.015    0.003
   6   100    600   0.25     939        0.031    0.006
   7    50    350   0.35     192        0.009    0.004
   8    50    400   0.4      299        0.021    0.007
   9    50    450   0.45     683        0.055    0.021
  10    50    500   0.5     2672        0.176    0.071


3  Numerical  Results 

In order to illustrate the performance of the proposed method, the $k$-clique enumeration
problem for $k$-partite graphs has been solved by BitCLQ and FINDCLIQUE for randomly
generated graph instances of several types. Both algorithms were implemented in C++
and run on a 64-bit Windows machine with a 3GHz dual-core processor and 4GB of RAM. It
is worth noting that the original implementation of the FINDCLIQUE algorithm by Grunert
et al. [3] relies on the use of the vector and list data types from the C++ standard
template library (STL). In our experiments, we observed that by replacing the original data
structure of vectors of lists with arrays, up to 300% improvement in FINDCLIQUE running
time is achieved on the data sets used in our case study. The numerical results reported
for the FINDCLIQUE algorithm are obtained using this "improved" implementation.

Our numerical experiments involve randomly generated instances of $k$-partite graphs
of two types. The first set of instances consists of two groups: small-size instances and
large-size instances. In the small-size instances, $k$-partite graphs are randomly generated
with the number of partites in the range $k \in [3, 10]$. For each value of $k$, the reported
running times and the number of $k$-cliques in the graph are averaged over 10 instances.
Table 1 shows the summary of the experimental results for this first group. The columns of
the table show the number $k$ of partites in the $k$-partite graph, the number $m$ of nodes in
each partite of the graph, the total number $|V|$ of nodes in the graph, the graph's density
$p$, and the total number of $k$-cliques in the graph (#CLQ). The density parameter $p$ is
used for generation of the graphs, and is equal to the probability of an edge connecting
two nodes from different partites: $\Pr\{(v_i, v_j) \in E\} = p$.
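One way to generate such instances is sketched below in Python, assuming that every pair of
nodes from different partites is joined independently with probability $p$. The function name
`random_k_partite`, its signature, and the seeding are assumptions made for illustration; the
generator used in the reported experiments is not described in further detail.

```python
import random


def random_k_partite(k, m, p, seed=None):
    """Random k-partite graph with m nodes per partite and density p.

    Nodes are numbered 0 .. k*m - 1; partite b holds nodes b*m .. (b+1)*m - 1.
    Each pair of nodes from different partites becomes an edge with probability p.
    """
    rng = random.Random(seed)
    partites = [list(range(b * m, (b + 1) * m)) for b in range(k)]
    edges = []
    for b1 in range(k):
        for b2 in range(b1 + 1, k):
            for u in partites[b1]:
                for v in partites[b2]:
                    if rng.random() < p:
                        edges.append((u, v))
    return partites, edges


partites, edges = random_k_partite(k=3, m=100, p=0.1, seed=0)
print(len(partites), sum(len(part) for part in partites))   # 3 300
```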

The second group includes instances of larger size with the values of $k \in \{25, 50, 75, 100\}$.
For each value of $k$ in this group, 10 random instances of the $k$-partite graph have been
generated and solved by FINDCLIQUE and BitCLQ. Table 2 summarizes the results of
the experiments for this group. Since the graphs used in this set of experiments are rather
large and the list of all $k$-cliques contained in them may not be found in a reasonable time,
the solution process has been terminated after 200 seconds and the number of $k$-cliques
found by each method was recorded. BitCLQ outperformed FINDCLIQUE in all cases,
finding from about 5% to about 73% more $k$-cliques within the time limit.

Table 2: Average number of k-cliques found in randomly generated instances of k-partite
graphs after 200 seconds.

    k     m    |V|     p    time    FINDCLIQUE        BitCLQ
   25    40   1000   0.8     200    13,556,733    23,516,581
   50    30   1500   0.9     200       800,369     1,032,111
   75    30   2250   0.95    200   557,042,389   735,722,241
  100    30   3000   0.95    200       348,416       365,799

Table 3: Average computational time (in seconds) needed to find the first n-clique in an
n-partite graph corresponding to a randomized instance of the Multidimensional Assignment
Problem with d dimensions and n elements per dimension.

   n    d     m    |V|     p     BitCLQ   FINDCLIQUE
  10    3    10    100   0.74      0.00         0.00
  20    3    12    240   0.86      0.00         0.00
  30    3    13    390   0.91      0.02         0.00
  40    3    13    520   0.93      0.76         1.38
  50    3    14    700   0.94      0.42         0.42
  60    3    14    840   0.95     55.28        86.87
  70    3    14    980   0.96    251.78       395.34
  10    4    22    220   0.65      0.00         0.00
  20    4    28    480   0.82      0.08         0.20
  30    4    31    930   0.87      8.18        22.41
  10    5    48    480   0.59      0.00         0.01
  20    5    68   1360   0.77     13.29        28.23

The third set of experiments was conducted to compare the performance of BitCLQ
with FINDCLIQUE on randomly generated instances of the Multidimensional Assignment
Problem (MAP). As was mentioned before, high-quality solutions for randomized MAPs
can be obtained as $n$-cliques in an $n$-partite subgraph of the underlying graph representing
the MAP instance; such graphs are constructed in a special way from the problem's data
(in this case, $m$ denotes the number of elements per dimension in a $d$-dimensional MAP).
For MAPs with random i.i.d. costs, the resulting $n$-partite graph can be viewed as randomly
generated with a certain density. The corresponding results are reported in Table 3, where
$n$ denotes the number of partitions in the graphs, and $d$ is the number of dimensions in
the MAP. For each value of $n$, 10 instances are solved, and the computational time to find
the first $n$-clique is recorded. Algorithms are terminated after finding the first $n$-clique.
The average computational time over 10 runs is reported for each $n$ for each algorithm.
In all cases but one, BitCLQ performs better or equally well compared to FINDCLIQUE.
4 Conclusions


In this paper, bitset-based data structures are proposed for the algorithm presented by
Grunert et al. [3] for the problem of enumerating all $k$-cliques in a $k$-partite graph. Utilization
of bitsets and the associated bit parallelism enables one to reduce the computational
cost of branching and backtracking in the branch-and-bound procedure. Numerical
experiments on small- and large-scale randomly generated $k$-partite graphs show that the
proposed approach allows for achieving substantial computational improvements over the
original method of [3].
stochastic programming problems with a class of downside risk measures, INFORMS Journal on
Computing, 27(2), 416-430.

Vinel, A. and P. Krokhmal (2015) Certainty equivalent measures of risk, Annals of Operations Research,
DOI 10.1007/s10479-015-1801-0.

Chernikov, D., Krokhmal, P., Zhupanska, O. I., and C. L. Pasiliao (2015) A two-stage stochastic PDE-
constrained optimization approach to vibration control of an electrically conductive composite plate
subjected to mechanical and electromagnetic loads, Structural and Multidisciplinary Optimization, 52(2),
227-352.

Vinel, A. and P. Krokhmal (2014) Polyhedral approximations in p-order cone programming, Optimization
Methods and Software, 29(6), 1210-1237.

Rysz, M., Mirghorbani, M., Krokhmal, P. and E. L. Pasiliao (2014) On risk-averse maximum weighted
subgraph problems, Journal of Combinatorial Optimization, 28(1), 167-185.

Vinel, A. and P. Krokhmal (2014) On valid inequalities for mixed integer p-order cone programming,
Journal of Optimization Theory and Applications, 160(2), 439-456.

Rysz, M., Krokhmal, P., and E. L. Pasiliao (2013) Minimum risk maximum clique problem, in: A. Sorokin and
P. M. Pardalos (Eds), Dynamics of Information Systems: Algorithmic Approaches, Springer Proceedings in
Mathematics & Statistics, vol. 51, 251-267.

Morenko, Y., Vinel, A., Yu, Z., and P. Krokhmal (2013) On p-norm linear discrimination, European Journal of
Operational Research, 231(3), 784-789.

Rysz, M., Pajouh, F., Krokhmal, P. and E. L. Pasiliao (2014) On risk-averse weighted k-club problems,
Examining Robustness and Vulnerability of Critical Infrastructure Networks, NATO Science for Peace and
Security Series - D: Information and Communication Security, vol. 37, 231-242.

Mirghorbani, M. and P. Krokhmal (2013) On finding k-cliques in k-partite graphs, Optimization Letters, 7(6),
1155-1165.

Vinel, A. and P. Krokhmal (2015) Mixed-Integer Programming with a Class of Nonlinear Convex
Constraints, under review in Discrete Optimization.

Rysz, M., Krokhmal, P., and E. L. Pasiliao (2015) Identifying resilient structures in stochastic networks: A
two-stage stochastic optimization approach, under review in Networks.

Rysz, M., Pajouh, F., Krokhmal, P., and E. L. Pasiliao (2015) Identifying risk-averse low-diameter clusters in
graphs with stochastic vertex weights, under review in Annals of Operations Research.

Changes  in  research  objectives  (if  any): 

During  the  last  project  period,  more  emphasis  has  been  placed  on  the  development  of  solution  methods  for 
risk-averse  combinatorial  problems. 

Change  in  AFOSR  Program  Manager,  if  any: 

Dr.  Donald  Hearn  was  replaced  by  Dr.  Fariba  Fahroo,  who  was  replaced  by  Dr.  Jean-Luc  Cambier. 

Extensions  granted  or  milestones  slipped,  if  any: 

A no-cost extension for the period from 04/01/2015 to 12/31/2015 was requested and granted. The no-cost
extension was requested because the PI, Dr. Pavlo Krokhmal, was on sabbatical leave in 2015,
during which he was awarded the National Research Council Senior Research Associateship Award that,
as a condition, required that the recipient not conduct research on other grants during the
period of the award.